BLAST MAKEBLASTDB FOR FASTA FILES

makeblastdb builds local BLAST databases from nucleotide or protein FASTA files. For more information, see the BLAST documentation.

URL:

Example

This wrapper can be used in the following way:

rule blast_makedatabase_nucleotide:
    input:
        fasta="genome/{genome}.fasta"
    output:
        multiext("results/{genome}.fasta",
            ".ndb",
            ".nhr",
            ".nin",
            ".not",
            ".nsq",
            ".ntf",
            ".nto"
        )
    log:
        "logs/{genome}.log"
    params:
        "-input_type fasta -blastdb_version 5 -parse_seqids"
    wrapper:
        "v1.1.0/bio/blast/makeblastdb"

rule blast_makedatabase_protein:
    input:
        fasta="protein/{protein}.fasta"
    output:
        multiext("results/{protein}.fasta",
            ".pdb",
            ".phr",
            ".pin",
            ".pot",
            ".psq",
            ".ptf",
            ".pto"
        )
    log:
        "logs/{protein}.log"
    params:
        "-input_type fasta -blastdb_version 5"
    wrapper:
        "v1.1.0/bio/blast/makeblastdb"

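Note that input, output and log file paths can be chosen freely. The extension of the first output file must start with .n (nucleotide) or .p (protein), however, because the wrapper derives the database type from it.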

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • blast==2.11.0

Input/Output

Input:

  • FASTA file

Output:

  • multiple files with different extensions (e.g. .nin, .nsq, .nhr for nucleotides or .pin, .psq, .phr for proteins)
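
As a rough illustration of the output naming, the sketch below lists the files expected for a given database prefix. The function name expected_db_files is hypothetical and not part of the wrapper. The extension lists are copied from the example rules above (database version 5); makeblastdb may write additional or differently named files, for example when a large database is split into volumes.

```python
# Extensions copied from the example rules in this document (BLAST database version 5).
DB_EXTENSIONS = {
    "nucl": [".ndb", ".nhr", ".nin", ".not", ".nsq", ".ntf", ".nto"],
    "prot": [".pdb", ".phr", ".pin", ".pot", ".psq", ".ptf", ".pto"],
}


def expected_db_files(prefix, db_type):
    """Return the file names expected for a database written to `prefix`.

    Hypothetical helper for illustration only; not part of the wrapper.
    """
    if db_type not in DB_EXTENSIONS:
        raise ValueError(f"db_type must be 'nucl' or 'prot', got {db_type!r}")
    return [prefix + ext for ext in DB_EXTENSIONS[db_type]]


if __name__ == "__main__":
    # Print one expected file name per line for a nucleotide database.
    for name in expected_db_files("results/ecoli.fasta", "nucl"):
        print(name)
```

A list like this can help when checking that every file declared in a rule's output actually exists after the rule has run.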

Authors

  • Antonie Vietor

Code

__author__ = "Antonie Vietor"
__copyright__ = "Copyright 2021, Antonie Vietor"
__email__ = "antonie.v@gmx.de"
__license__ = "MIT"

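from snakemake.shell import shell
from os import path

log = snakemake.log
# The first declared output file determines both the database prefix and its type.
out = snakemake.output[0]

# Split off the last extension, e.g. "results/x.fasta.ndb" -> ("results/x.fasta", ".ndb").
(out_name, ext) = path.splitext(out)

# Infer -dbtype from the extension: .n* means nucleotide, .p* means protein.
if ext.startswith(".n"):
    db_type = "nucl"
elif ext.startswith(".p"):
    db_type = "prot"
else:
    # Fail early instead of passing an empty -dbtype to makeblastdb.
    raise ValueError(
        f"Cannot infer database type from output extension '{ext}'; "
        "expected an extension starting with '.n' or '.p'."
    )

shell(
    "makeblastdb"
    " -in {snakemake.input.fasta}"
    " -dbtype {db_type}"
    " {snakemake.params}"
    " -logfile {log}"
    " -out {out_name}"
)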
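
To make the wrapper's logic easier to follow outside of Snakemake, here is a minimal standalone sketch of the same two steps: inferring the database type from the first output file, then assembling the makeblastdb arguments in the order the shell call above uses. The function names infer_db_type and build_command are hypothetical, the command is only built and never executed, and raising an error for an unrecognized extension is a choice made for this sketch.

```python
import shlex
from os import path


def infer_db_type(first_output):
    """Return (database prefix, -dbtype value) for the first output file.

    Hypothetical helper for illustration only; not part of the wrapper.
    """
    out_name, ext = path.splitext(first_output)
    if ext.startswith(".n"):
        return out_name, "nucl"
    if ext.startswith(".p"):
        return out_name, "prot"
    raise ValueError(f"Cannot infer database type from extension {ext!r}")


def build_command(fasta, first_output, log, extra=""):
    """Assemble the makeblastdb argument list without running it."""
    out_name, db_type = infer_db_type(first_output)
    cmd = ["makeblastdb", "-in", fasta, "-dbtype", db_type]
    cmd += shlex.split(extra)  # user-supplied params, e.g. "-parse_seqids"
    cmd += ["-logfile", log, "-out", out_name]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "genome/ecoli.fasta",
        "results/ecoli.fasta.ndb",
        "logs/ecoli.log",
        "-input_type fasta -blastdb_version 5 -parse_seqids",
    )
    # Show the command as a single shell-quoted string.
    print(shlex.join(cmd))
```

Because the prefix is obtained by removing only the last extension, an output named results/ecoli.fasta.ndb yields the database prefix results/ecoli.fasta, which is the value to pass to BLAST tools via -db later.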