LASTDB

LAST finds similar regions between sequences and aligns them. It is designed for comparing large datasets to each other (e.g. vertebrate genomes and/or large numbers of DNA reads). This wrapper runs lastdb, which builds the sequence database that LAST aligns against.

Software dependencies

  • last=874

Example

This wrapper can be used in the following way:

rule lastdb_transcript:
    input:
        "test-transcript.fa"
    output:
        "test-transcript.fa.prj",
    params:
        protein_input=False,
        extra=""
    log:
        "logs/lastdb/test-transcript.log"
    wrapper:
        "0.65.0/bio/last/lastdb"

rule lastdb_protein:
    input:
        "test-protein.fa"
    output:
        "test-protein.fa.prj",
    params:
        protein_input=True,
        extra=""
    log:
        "logs/lastdb/test-protein.log"
    wrapper:
        "0.65.0/bio/last/lastdb"

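Note that input and log file paths can be chosen freely. The output path should be the input path with .prj appended, because the wrapper uses the input path as the lastdb database name and never reads the output path. When running with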

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
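
The wrapper script passes the rule's thread count to lastdb through -P, but the example rules above do not set one, so they run with Snakemake's default of a single thread. The following is a minimal sketch of a rule that requests more threads and uses a wildcard instead of a fixed file name. The rule name lastdb_index, the {sample} wildcard and the thread count of 4 are illustrative assumptions, not part of the wrapper.

```python
# Sketch only: rule name, wildcard name and thread count are assumptions.
rule lastdb_index:
    input:
        "{sample}.fa"
    output:
        # Input path plus ".prj", matching the example rules above.
        "{sample}.fa.prj"
    params:
        protein_input=False,
        extra=""
    threads: 4  # forwarded to lastdb as -P 4 by the wrapper script
    log:
        "logs/lastdb/{sample}.log"
    wrapper:
        "0.65.0/bio/last/lastdb"
```

Snakemake may scale the thread count down if fewer cores are made available on the command line.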

Authors

  • N. Tessa Pierce

Code

__author__ = "N. Tessa Pierce"
__copyright__ = "Copyright 2019, N. Tessa Pierce"
__email__ = "ntpierce@gmail.com"
__license__ = "MIT"

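from snakemake.shell import shell

# Extra command-line arguments passed through to lastdb unchanged.
extra = snakemake.params.get("extra", "")
# Send stderr (but not stdout) to the rule's log file.
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Add -p when the input sequences are proteins.
protein_cmd = ""
protein = snakemake.params.get("protein_input", False)

if protein:
    protein_cmd = " -p "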

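# lastdb expects the database name first, then the FASTA file. The input path
# doubles as the database name, so the index files are written next to the input.
shell("lastdb {extra} {protein_cmd} -P {snakemake.threads} {snakemake.input} {snakemake.input} {log}")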
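
To check which options a given rule would hand to lastdb, the option handling above can be mirrored in plain Python without running Snakemake. This is only a sketch: lastdb_options is a hypothetical helper that is not part of the wrapper, and it covers just the option tokens (extra, -p, -P), not the positional arguments or the log redirection.

```python
import shlex


def lastdb_options(threads=1, protein_input=False, extra=""):
    """Return the option tokens placed before lastdb's positional arguments.

    Hypothetical helper for illustration; it is not part of the wrapper.
    """
    options = shlex.split(extra)      # free-form extra arguments, if any
    if protein_input:
        options.append("-p")          # treat the input sequences as proteins
    options += ["-P", str(threads)]   # thread count, as set by the rule
    return options


print(lastdb_options(threads=4, protein_input=True))
# ['-p', '-P', '4']
```

With the defaults (one thread, nucleotide input, no extra arguments) the helper returns only the -P option, which matches what the example rules above would produce.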