FGBIO GROUPREADSBYUMI

Groups reads that appear to have come from the same original molecule, based on their mapping positions and unique molecular identifiers (UMIs).
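To make the idea of grouping concrete, here is a toy sketch that clusters UMI sequences differing by at most one mismatch. It is an illustration only and not fgbio's implementation: the function names `hamming` and `group_umis` are hypothetical, the clustering is plain single-linkage, and mapping positions and read counts are ignored.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis, max_edits=1):
    """Cluster UMIs so that any two within max_edits mismatches share a group.

    Assumption: simple single-linkage clustering via union-find; this is a
    teaching sketch, not the strategy fgbio applies.
    """
    unique = sorted(set(umis))
    parent = {u: u for u in unique}

    def find(u):
        # Follow parent links to the group representative, compressing the path.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            if hamming(a, b) <= max_edits:
                parent[find(b)] = find(a)

    groups = defaultdict(list)
    for u in unique:
        groups[find(u)].append(u)
    return sorted(groups.values())


if __name__ == "__main__":
    print(group_umis(["AAAA", "AAAT", "CCCC", "AAAA"]))
```

In this sketch, "AAAA" and "AAAT" end up in one group because they differ at a single position, while "CCCC" forms its own group.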

Example

This wrapper can be used in the following way:

rule GroupReads:
    input:
        "mapped/{sample}.bam"
    output:
        bam="mapped/{sample}.gu.bam",
        hist="mapped/{sample}.gu.histo.tsv",
    params:
        extra="-s adjacency --edits 1"
    log:
        "logs/fgbio/group_reads/{sample}.log"
    wrapper:
        "v1.9.0/bio/fgbio/groupreadsbyumi"

Note that input, output and log file paths can be chosen freely.
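As a rough illustration of how the `{sample}` wildcard in the example rule maps to concrete files, the sketch below fills the output and log patterns with plain `str.format`. Snakemake resolves wildcards itself; the helper `resolve_paths` is hypothetical and exists only for this example.

```python
# Path patterns copied from the example rule; any other layout would work too.
OUTPUT_PATTERNS = {
    "bam": "mapped/{sample}.gu.bam",
    "hist": "mapped/{sample}.gu.histo.tsv",
}
LOG_PATTERN = "logs/fgbio/group_reads/{sample}.log"


def resolve_paths(sample):
    """Return the concrete output and log paths for one sample name."""
    paths = {name: p.format(sample=sample) for name, p in OUTPUT_PATTERNS.items()}
    paths["log"] = LOG_PATTERN.format(sample=sample)
    return paths


if __name__ == "__main__":
    for name, path in resolve_paths("a").items():
        print(f"{name}: {path}")
```

The wrapper only requires that the outputs are named `bam` and `hist`; the directory layout and file suffixes shown here are just the ones used in the example.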

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • fgbio==0.6.1

Authors

  • Patrik Smeds

Code

__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2018, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell

shell.executable("bash")

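# Redirect only stderr to the rule's log file; stdout is left untouched.
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Optional extra command-line arguments, passed to fgbio verbatim.
extra_params = snakemake.params.get("extra", "")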

# The wrapper expects exactly one input BAM file.
if len(snakemake.input) != 1:
    raise ValueError(
        "Input bam should be one bam file: " + str(snakemake.input) + "!"
    )

bam_input = snakemake.input[0]

output_bam_file = snakemake.output.bam

if not isinstance(output_bam_file, str) and len(output_bam_file) != 1:
    raise ValueError("Bam output should be one bam file: " + str(output_bam_file) + "!")

output_histo_file = snakemake.output.hist

if not isinstance(output_histo_file, str) and len(output_histo_file) != 1:
    raise ValueError(
        "Histo output should be one histogram file path: "
        + str(output_histo_file)
        + "!"
    )

shell(
    "fgbio GroupReadsByUmi"
    " -i {bam_input}"
    " -o {output_bam_file}"
    " -f {output_histo_file}"
    " {extra_params}"
    " {log}"
)
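For orientation, the following sketch reproduces how the final `shell(...)` call assembles the fgbio command line from the rule's values. The helper `build_command` is hypothetical and not part of the wrapper, and the log redirection that the wrapper appends via `{log}` is left out.

```python
def build_command(bam_input, output_bam, output_hist, extra=""):
    """Assemble the fgbio call as a single string (log redirection omitted)."""
    parts = [
        "fgbio GroupReadsByUmi",
        f"-i {bam_input}",
        f"-o {output_bam}",
        f"-f {output_hist}",
    ]
    if extra:
        # Extra arguments are appended verbatim, as in the wrapper.
        parts.append(extra)
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_command(
            "mapped/a.bam",
            "mapped/a.gu.bam",
            "mapped/a.gu.histo.tsv",
            extra="-s adjacency --edits 1",
        )
    )
```

Printing the command this way can help check which arguments a given rule configuration would pass to fgbio before running the workflow.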