EMU ABUNDANCE

Generate relative abundance estimates from ONT, PacBio or short 16S reads using emu.

URL: https://github.com/treangenlab/emu

Example

This wrapper can be used in the following way:

rule abundance:
    input:
        reads="{sample}.fa",
        db="database",
    output:
        abundances="{sample}_rel-abundance.tsv",
        alignments="{sample}_emu_alignments.sam",
        unclassified="{sample}_unclassified.fa",
    log:
        "logs/emu/{sample}_abundance.log",
    params:
        extra="--type map-ont --keep-counts",
    threads: 3  # optional, defaults to 1
    wrapper:
        "v3.9.0-14-g476823b/bio/emu/abundance"


rule abundance_paired:
    input:
        reads=["{sample}_R1.fq", "{sample}_R2.fq"],
        db="database",
    output:
        abundances="{sample}_rel-abundance_paired.tsv",
    log:
        "logs/emu/{sample}_abundance_paired.log",
    params:
        extra="--type sr --keep-counts",
    threads: 3  # optional, defaults to 1
    wrapper:
        "v3.9.0-14-g476823b/bio/emu/abundance"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

emu=3.4.5

Input/Output

Input:

reads: a single FASTA/FASTQ file, or a pair of FASTQ files for paired-end reads.
db: emu database (optional; check documentation for pre-built databases and how to build them).

Output:

abundances: TSV with relative (and, optionally, absolute) abundances.
alignments: SAM file with the alignments (optional).
unclassified: FASTA file with unclassified sequences (optional).
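
As a rough illustration of how the abundances output might be consumed downstream, the sketch below parses a small tab-separated table with Python's csv module and ranks taxa by abundance. The column names (tax_id, abundance, species), the helper read_abundances and the sample rows are assumptions made for illustration; check the header of your own file before relying on them.

```python
import csv
import io

# Hypothetical excerpt of an abundances TSV; the column names and values are
# assumptions for illustration, not output copied from a real emu run.
SAMPLE_TSV = (
    "tax_id\tabundance\tspecies\n"
    "562\t0.75\tEscherichia coli\n"
    "1280\t0.25\tStaphylococcus aureus\n"
)


def read_abundances(handle, key="species", value="abundance"):
    """Return a {key: float(value)} mapping from a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key]: float(row[value]) for row in reader}


abundances = read_abundances(io.StringIO(SAMPLE_TSV))
# Rank taxa from most to least abundant.
ranked = sorted(abundances.items(), key=lambda item: item[1], reverse=True)
for name, fraction in ranked:
    print(f"{name}\t{fraction:.2%}")
```

To read a real file, pass an open file handle (for example open("sample_rel-abundance.tsv", newline="")) instead of the in-memory string.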

Params

extra: Any optional parameter, such as --type (sequencer) or --min-abundance. Flags that control output are handled automatically by the wrapper (e.g. --output-dir, --output-basename …) and should not be set here.
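
To make the split between user options and wrapper-managed flags concrete, here is a minimal sketch that assembles an argument list in the same order as the wrapper code listed under Code. The function build_emu_command and its outdir argument are hypothetical names introduced for illustration; the wrapper itself builds a shell string rather than a list.

```python
import shlex


def build_emu_command(reads, db="", threads=1, extra="", outdir="tmp"):
    """Assemble an argument list similar to the one the wrapper runs.

    build_emu_command and outdir are hypothetical, for illustration only.
    """
    cmd = ["emu", "abundance", *reads]
    if db:
        cmd += ["--db", db]
    # Output-related flags are fixed by the wrapper, not by the user.
    cmd += [
        "--keep-files",
        "--output-dir", outdir,
        "--output-basename", "output",
        "--output-unclassified",
        "--threads", str(threads),
    ]
    # Options from params.extra are split shell-style and appended last.
    cmd += shlex.split(extra)
    return cmd


cmd = build_emu_command(
    ["sample.fa"], db="database", threads=3, extra="--type map-ont --keep-counts"
)
print(shlex.join(cmd))
```

Because the output flags are already present, repeating them in extra would pass them twice, which is why they are best left out.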

Authors

Curro Campuzano

Code

__author__ = "Curro Campuzano Jimenez"
__copyright__ = "Copyright 2024, Curro Campuzano Jimenez"
__email__ = "campuzanocurro@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell
import tempfile
import os

# Redirect both stdout and stderr of the command to the rule's log file.
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")
# The database is optional; pass --db only when one is given as input.
db = snakemake.input.get("db", "")
if db:
    db = f"--db {db}"

# Run emu in a temporary directory with a fixed basename, then move only
# the outputs that the rule declares to their final locations.
with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "emu abundance {snakemake.input.reads} {db}"
        " --keep-files --output-dir {tmpdir}"
        " --output-basename output --output-unclassified"
        " --threads {snakemake.threads}"
        " {extra}"
        " {log}"
    )
    if out_tsv := snakemake.output.get("abundances"):
        shell("mv {tmpdir}/output_rel-abundance.tsv {out_tsv}")
    if out_sam := snakemake.output.get("alignments"):
        shell("mv {tmpdir}/output_emu_alignments.sam {out_sam}")
    if out_fa := snakemake.output.get("unclassified"):
        shell("mv {tmpdir}/output_unclassified.fa {out_fa}")
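
The temporary-directory pattern used above can be tried without emu installed. The following sketch is an illustration, not the wrapper's code: it creates empty placeholder files under the fixed names from the mv commands, then moves only the requested output. The names PRODUCED and collect_outputs and the destination path are hypothetical, and shutil.move stands in for the shell mv.

```python
import shutil
import tempfile
from pathlib import Path

# Fixed names written under the temporary directory, keyed by output name.
# These mirror the three mv commands in the wrapper code.
PRODUCED = {
    "abundances": "output_rel-abundance.tsv",
    "alignments": "output_emu_alignments.sam",
    "unclassified": "output_unclassified.fa",
}


def collect_outputs(tmpdir, requested):
    """Move each requested file out of tmpdir; return the destinations."""
    moved = {}
    for key, destination in requested.items():
        source = Path(tmpdir) / PRODUCED[key]
        # Creating the parent directory here is a convenience of this sketch.
        Path(destination).parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(destination))
        moved[key] = Path(destination)
    return moved


with tempfile.TemporaryDirectory() as workdir:
    tmpdir = Path(workdir) / "emu_tmp"
    tmpdir.mkdir()
    # Stand-in for the emu run: create empty placeholder files.
    for filename in PRODUCED.values():
        (tmpdir / filename).write_text("")
    target = Path(workdir) / "results" / "s1_rel-abundance.tsv"
    moved = collect_outputs(tmpdir, {"abundances": str(target)})
    kept = moved["abundances"].exists()
    leftovers = sorted(p.name for p in tmpdir.iterdir())
    print(kept, leftovers)
```

Files that are not requested stay in the temporary directory and disappear with it, which matches the behaviour of a rule such as abundance_paired that declares only the abundances output.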