FGBIO ANNOTATEBAMWITHUMIS

Annotates existing BAM files with UMIs (Unique Molecular Indices, aka Molecular IDs, Molecular barcodes) from a separate FASTQ file.

URL: https://fulcrumgenomics.github.io/fgbio/

Example

This wrapper can be used in the following way:

rule annotate_bam_single_fastq:
    input:
        bam="mapped/{sample}.bam",
        umi="umi/{sample}.fastq",
    output:
        "mapped/{sample}.annotated.bam",
    params:
        extra="",  # additional fgbio arguments; the wrapper reads params.extra
    resources:
        # suggestion assuming unsorted input, so that memory should
        # be proportional to input size:
        # https://fulcrumgenomics.github.io/fgbio/tools/latest/AnnotateBamWithUmis.html
        mem_mb=lambda wildcards, input: max([input.size_mb * 1.3, 200]),
    log:
        "logs/fgbio/annotate_bam/{sample}.log",
    wrapper:
        "v3.9.0/bio/fgbio/annotatebamwithumis"


rule annotate_bam_multiple_fastqs:
    input:
        bam="mapped/{sample}.bam",
        umi=[
            "umi/{sample}.fastq",
            "umi/{sample}.fastq",
        ],
    output:
        "mapped/{sample}-{sample}.annotated.bam",
    params:
        extra="",  # additional fgbio arguments; the wrapper reads params.extra
    resources:
        # suggestion assuming unsorted input, so that memory should
        # be proportional to input size:
        # https://fulcrumgenomics.github.io/fgbio/tools/latest/AnnotateBamWithUmis.html
        mem_mb=lambda wildcards, input: max([input.size_mb * 1.3, 200]),
    log:
        "logs/fgbio/annotate_bam/{sample}-{sample}.log",
    wrapper:
        "v3.9.0/bio/fgbio/annotatebamwithumis"

Note that input, output and log file paths can be chosen freely.
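The memory suggestion in the resources blocks above can be read as a plain function: scale the total input size by a safety factor, but never request less than a fixed floor. The following is a minimal sketch, assuming the factor of 1.3 and the 200 MB floor from the example rules; the name estimate_mem_mb is hypothetical and not part of the wrapper.

```python
def estimate_mem_mb(input_size_mb, factor=1.3, floor_mb=200):
    """Return a memory request in MB: the scaled input size, never below the floor."""
    return max(input_size_mb * factor, floor_mb)


# Small input: the floor applies. Large input: the request scales with size.
print(estimate_mem_mb(50))
print(estimate_mem_mb(1000))
```

The floor keeps very small test inputs from requesting too little memory for the JVM to start, while the factor leaves headroom above the raw input size.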

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • fgbio=2.2.1

  • snakemake-wrapper-utils=0.6.2

Authors

  • Patrik Smeds

Code

__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2018, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell
from snakemake.io import Namedlist
from snakemake_wrapper_utils.java import get_java_opts

log = snakemake.log_fmt_shell(stdout=False, stderr=True)
extra_params = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

# Use .get() so a missing named input yields None instead of an AttributeError.
bam_input = snakemake.input.get("bam")

if bam_input is None:
    raise ValueError("Missing bam input file!")
elif not isinstance(bam_input, str):
    raise ValueError("Input bam should be a string: " + str(bam_input) + "!")

# Use .get() so a missing named input yields None instead of an AttributeError.
umi_input = snakemake.input.get("umi")

if umi_input is None:
    raise ValueError("Missing input file with UMIs")
elif not isinstance(umi_input, (str, Namedlist)):
    raise ValueError(
        "Input UMIs-file should be a string or a snakemake io list: "
        + str(umi_input)
        + "!"
    )

if len(snakemake.output) != 1:
    raise ValueError("Only one output value expected: " + str(snakemake.output) + "!")
output_file = snakemake.output[0]


if output_file is None:
    raise ValueError("Missing output file!")
elif not isinstance(output_file, str):
    raise ValueError("Output bam-file should be a string: " + str(output_file) + "!")

shell(
    "fgbio {java_opts} AnnotateBamWithUmis"
    " -i {bam_input}"
    " -f {umi_input}"
    " -o {output_file}"
    " {extra_params}"
    " {log}"
)
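To see what command the wrapper ends up running for one versus several UMI FASTQ files, the assembly step can be sketched without Snakemake. This is an illustration only: build_command is a hypothetical helper, and it assumes that a list of FASTQ paths is rendered space-separated after -f, which is how the shell call above is expected to format list-like inputs.

```python
def build_command(bam, umis, output, extra="", java_opts=""):
    """Assemble an fgbio AnnotateBamWithUmis command line as a single string."""
    # Accept a single FASTQ path or a list of paths, as the wrapper does.
    fastqs = umis if isinstance(umis, str) else " ".join(umis)
    parts = [
        "fgbio", java_opts, "AnnotateBamWithUmis",
        "-i", bam,
        "-f", fastqs,
        "-o", output,
        extra,
    ]
    # Drop empty pieces (e.g. no java_opts or extra) to avoid double spaces.
    return " ".join(p for p in parts if p)


print(build_command("mapped/a.bam", "umi/a.fastq", "mapped/a.annotated.bam"))
print(build_command("mapped/a.bam", ["umi/a.1.fastq", "umi/a.2.fastq"], "mapped/a.annotated.bam"))
```

The file names in the usage lines are placeholders. The sketch omits the log redirection that the wrapper appends through log_fmt_shell.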