PALADIN ALIGN

Align nucleotide reads to a protein FASTA file that has been indexed with paladin index. PALADIN is a protein sequence alignment tool designed for the accurate functional characterization of metagenomes.

URL:

Example

This wrapper can be used in the following way:

rule paladin_align:
    input:
        reads=["reads/reads.left.fq.gz"],
        index="index/prot.fasta.bwt",
    output:
        "paladin_mapped/{sample}.bam" # will output BAM format if output file ends with ".bam", otherwise SAM format
    log:
        "logs/paladin/{sample}.log"
    threads: 4
    wrapper:
        "v1.2.0/bio/paladin/align"

Note that input, output and log file paths can be chosen freely.
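
The wrapper code also reads two optional params: f (minimum ORF length, defaulting to 250) and extra (additional arguments passed through to paladin align unchanged). A minimal sketch of a rule that sets both is shown below; the rule name, the reads path pattern and the value chosen for f are illustrative assumptions, not recommendations.

```python
# Hypothetical variant of the example rule; only the params block is new.
rule paladin_align_custom:
    input:
        reads=["reads/{sample}.fq.gz"],  # assumed layout: one merged read file per sample
        index="index/prot.fasta.bwt",
    output:
        "paladin_mapped/{sample}.bam",
    log:
        "logs/paladin/{sample}.log",
    params:
        f=100,     # passed to paladin align as -f (wrapper default: 250)
        extra="",  # any further paladin align arguments, passed through verbatim
    threads: 4
    wrapper:
        "v1.2.0/bio/paladin/align"
```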

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • paladin=1.4.4
  • samtools=1.5

Input/Output

Input:

  • nucleotide reads (FASTQ); paired-end reads must be merged first
  • indexed protein FASTA file, given as the path to its .bwt index file (output of paladin index or paladin prepare)

Output:

  • mapped reads (SAM or BAM format)
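
The index input points at the .bwt file, but the wrapper hands paladin align the index prefix (the path without the .bwt extension). The following sketch shows one way to derive that prefix; the function name index_prefix is hypothetical and not part of the wrapper.

```python
def index_prefix(index_path, suffix=".bwt"):
    """Return the index prefix for a given path to the .bwt index file.

    Hypothetical helper for illustration; the wrapper performs an
    equivalent string operation inline.
    """
    path = str(index_path)
    # Strip only a trailing suffix, so a ".bwt" elsewhere in the path is kept.
    if path.endswith(suffix):
        return path[: -len(suffix)]
    return path


print(index_prefix("index/prot.fasta.bwt"))
```

For the example rule above, this gives index/prot.fasta as the prefix.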

Authors

    1. Tessa Pierce

Code

"""Snakemake wrapper for PALADIN alignment"""

__author__ = "N. Tessa Pierce"
__copyright__ = "Copyright 2019, N. Tessa Pierce"
__email__ = "ntpierce@gmail.com"
__license__ = "MIT"

from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

r = snakemake.input.get("reads")
assert (
    r is not None
), "reads are required as input. If you have paired end reads, please merge them first (e.g. with PEAR)"
index = snakemake.input.get("index")
assert (
    index is not None
), "please index your assembly and provide the basename (with '.bwt' extension) via the 'index' input param"

# strip only the trailing ".bwt" so the rest of the path is left untouched
index_base = str(index).rsplit(".bwt", 1)[0]

outfile = snakemake.output

# if bam output, pipe to bam!
output_cmd = "  | samtools view -Sb - > " if str(outfile).endswith(".bam") else " -o "

min_orf_len = snakemake.params.get("f", "250")

# wrap the whole pipeline in a subshell so stderr from every step goes to the log
shell(
    "(paladin align -f {min_orf_len} -t {snakemake.threads} {extra} {index_base} {r} {output_cmd} {outfile}) {log}"
)
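
To make the output handling easier to follow, here is a standalone sketch of how the command line could be assembled outside Snakemake. It mirrors the strings the wrapper formats (pipe through samtools view for a ".bam" output, otherwise pass the path with -o) and only builds the command; it does not run it. The function name build_paladin_command and its signature are assumptions made for illustration.

```python
import shlex


def build_paladin_command(reads, index_prefix, outfile,
                          threads=1, min_orf_len=250, extra=""):
    """Build the shell command string for one alignment (illustrative only)."""
    parts = ["paladin", "align", "-f", str(min_orf_len), "-t", str(threads)]
    if extra:
        # extra is a free-form argument string, as in the wrapper's params
        parts.extend(shlex.split(extra))
    parts += [index_prefix, reads]
    cmd = " ".join(shlex.quote(p) for p in parts)
    if outfile.endswith(".bam"):
        # BAM requested: convert the SAM stream on stdout with samtools
        return f"{cmd} | samtools view -Sb - > {shlex.quote(outfile)}"
    # Otherwise pass the output path via -o, as the wrapper does
    return f"{cmd} -o {shlex.quote(outfile)}"


print(build_paladin_command(
    "reads/reads.left.fq.gz", "index/prot.fasta",
    "paladin_mapped/a.bam", threads=4,
))
```

Quoting each argument with shlex.quote is a design choice of this sketch, not something the wrapper does; it keeps paths containing spaces from being split by the shell.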