FGBIO ANNOTATEBAMWITHUMIS
Annotates an existing BAM file with UMIs (Unique Molecular Indices, also known as molecular IDs or molecular barcodes) read from one or more separate FASTQ files.
URL: https://fulcrumgenomics.github.io/fgbio/
Example
This wrapper can be used in the following way:
rule annotate_bam_single_fastq:
    input:
        bam="mapped/{sample}.bam",
        umi="umi/{sample}.fastq",
    output:
        "mapped/{sample}.annotated.bam",
    params:
        # The wrapper reads additional command-line arguments from "extra".
        extra="",
    resources:
        # Suggested for unsorted input, where memory use should be
        # proportional to input size:
        # https://fulcrumgenomics.github.io/fgbio/tools/latest/AnnotateBamWithUmis.html
        mem_mb=lambda wildcards, input: max([input.size_mb * 1.3, 200]),
    log:
        "logs/fgbio/annotate_bam/{sample}.log",
    wrapper:
        "v3.0.2-2-g0dea6a1/bio/fgbio/annotatebamwithumis"


rule annotate_bam_multiple_fastqs:
    input:
        bam="mapped/{sample}.bam",
        umi=[
            "umi/{sample}.fastq",
            "umi/{sample}.fastq",
        ],
    output:
        "mapped/{sample}-{sample}.annotated.bam",
    params:
        # The wrapper reads additional command-line arguments from "extra".
        extra="",
    resources:
        # Suggested for unsorted input, where memory use should be
        # proportional to input size:
        # https://fulcrumgenomics.github.io/fgbio/tools/latest/AnnotateBamWithUmis.html
        mem_mb=lambda wildcards, input: max([input.size_mb * 1.3, 200]),
    log:
        "logs/fgbio/annotate_bam/{sample}-{sample}.log",
    wrapper:
        "v3.0.2-2-g0dea6a1/bio/fgbio/annotatebamwithumis"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
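The `mem_mb` lambda in the example rules requests memory in proportion to the input size, with a lower bound of 200 MB. The following is a minimal Python sketch of that calculation outside Snakemake, so the behaviour can be checked by hand. The function name `estimate_mem_mb` and its parameter names are hypothetical; the factor 1.3 and the 200 MB floor are simply the values used in the example rules, not measured requirements.

```python
def estimate_mem_mb(input_size_mb, factor=1.3, floor_mb=200):
    """Return a memory request in MB for a given total input size in MB.

    Mirrors the example rules: scale the input size by `factor`,
    but never request less than `floor_mb`.
    """
    return max(input_size_mb * factor, floor_mb)


if __name__ == "__main__":
    # Small inputs fall back to the floor; large inputs scale with size.
    for size_mb in (50, 1000):
        print(size_mb, estimate_mem_mb(size_mb))
```

In the rules themselves, `input.size_mb` is supplied by Snakemake for the rule's input files, so the BAM and the UMI FASTQ files both contribute to the estimate.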
Software dependencies
fgbio=2.1.0
snakemake-wrapper-utils=0.6.2
Code
__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2018, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"

from snakemake.shell import shell
from snakemake.io import Namedlist
from snakemake_wrapper_utils.java import get_java_opts

# Redirect stderr (but not stdout) to the rule's log file.
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

extra_params = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

# The BAM input must be a single named file. Using .get() returns None
# when the key is absent, so the check below can actually trigger.
bam_input = snakemake.input.get("bam")
if bam_input is None:
    raise ValueError("Missing bam input file!")
elif not isinstance(bam_input, str):
    raise ValueError("Input bam should be a string: " + str(bam_input) + "!")

# The UMI input may be one FASTQ (string) or several (Snakemake io list).
umi_input = snakemake.input.get("umi")
if umi_input is None:
    raise ValueError("Missing input file with UMIs")
elif not isinstance(umi_input, (str, Namedlist)):
    raise ValueError(
        "Input UMIs-file should be a string or a snakemake io list: "
        + str(umi_input)
        + "!"
    )

# Exactly one output BAM is expected.
if len(snakemake.output) != 1:
    raise ValueError("Only one output value expected: " + str(snakemake.output) + "!")
output_file = snakemake.output[0]
if output_file is None:
    raise ValueError("Missing output file!")
elif not isinstance(output_file, str):
    raise ValueError("Output bam-file should be a string: " + str(output_file) + "!")

shell(
    "fgbio {java_opts} AnnotateBamWithUmis"
    " -i {bam_input}"
    " -f {umi_input}"
    " -o {output_file}"
    " {extra_params}"
    " {log}"
)
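To illustrate what the final `shell(...)` call produces, here is a rough, standalone Python sketch that assembles a comparable command string for either a single UMI FASTQ or a list of them. It is not part of the wrapper: `build_command` is a hypothetical helper, the log redirection is omitted, and it assumes that Snakemake's shell formatting joins a list of input files with spaces, which is how the wrapper passes several FASTQ files after `-f`.

```python
import shlex


def build_command(bam, umi, output, extra="", java_opts=""):
    """Assemble an fgbio AnnotateBamWithUmis command line as a string.

    `umi` may be a single path (str) or a list of paths; a list is
    expanded into space-separated values after -f (an assumption about
    how Snakemake formats list inputs).
    """
    umi_files = [umi] if isinstance(umi, str) else list(umi)
    parts = ["fgbio"]
    if java_opts:
        parts.append(java_opts)
    parts += ["AnnotateBamWithUmis", "-i", shlex.quote(bam), "-f"]
    parts += [shlex.quote(path) for path in umi_files]
    parts += ["-o", shlex.quote(output)]
    if extra:
        parts.append(extra)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_command("mapped/a.bam", "umi/a.fastq", "mapped/a.annotated.bam"))
    print(
        build_command(
            "mapped/a.bam",
            ["umi/a_1.fastq", "umi/a_2.fastq"],
            "mapped/a.annotated.bam",
        )
    )
```

The file names in the usage lines are placeholders chosen for the sketch; in a workflow the paths come from the rule's `input` and `output` sections.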