MOSDEPTH

https://img.shields.io/badge/wrapper_version-v9.8.0-10785b https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/mosdepth?label=version%20update%20pull%20requests&color=1cb481

fast BAM/CRAM depth calculation

Example

This wrapper can be used in the following way:

rule mosdepth:
    input:
        bam="aligned/{dataset}.bam",
        bai="aligned/{dataset}.bam.bai",
    output:
        dist="mosdepth/{dataset}.mosdepth.global.dist.txt",
        per_base="mosdepth/{dataset}.per-base.bed.gz",
        summary="mosdepth/{dataset}.mosdepth.summary.txt",
    log:
        "logs/mosdepth/{dataset}.log",
    params:
        extra="--fast-mode",
    threads: 4
    wrapper:
        "v9.8.0/bio/mosdepth"


rule mosdepth_bed:
    input:
        bam="aligned/{dataset}.bam",
        bai="aligned/{dataset}.bam.bai",
        bed="test.bed",
    output:
        dist="mosdepth_bed/{dataset}.mosdepth.global.dist.txt",
        bed_dist="mosdepth_bed/{dataset}.mosdepth.region.dist.txt",
        bed="mosdepth_bed/{dataset}.regions.bed.gz",
        summary="mosdepth_bed/{dataset}.mosdepth.summary.txt",
    log:
        "logs/mosdepth_bed/{dataset}.log",
    params:
        extra="--use-median",
    threads: 4
    wrapper:
        "v9.8.0/bio/mosdepth"


rule mosdepth_by_threshold:
    input:
        bam="aligned/{dataset}.bam",
        bai="aligned/{dataset}.bam.bai",
    output:
        dist="mosdepth_by_threshold/{dataset}.mosdepth.global.dist.txt",
        bed_dist="mosdepth_by_threshold/{dataset}.mosdepth.region.dist.txt",
        bed="mosdepth_by_threshold/{dataset}.regions.bed.gz",
        thres="mosdepth_by_threshold/{dataset}.thresholds.bed.gz",
        summary="mosdepth_by_threshold/{dataset}.mosdepth.summary.txt",
    log:
        "logs/mosdepth_by/{dataset}.log",
    params:
        by="500",  # needed for mosdepth.region.dist.txt and regions.bed.gz
        thresholds="1,5,10,30",  # needed for thresholds.bed.gz
    threads: 4
    wrapper:
        "v9.8.0/bio/mosdepth"


rule mosdepth_quantize_precision:
    input:
        bam="aligned/{dataset}.bam",
        bai="aligned/{dataset}.bam.bai",
    output:
        dist="mosdepth_quantize_precision/{dataset}.mosdepth.global.dist.txt",
        quant="mosdepth_quantize_precision/{dataset}.quantized.bed.gz",
        summary="mosdepth_quantize_precision/{dataset}.mosdepth.summary.txt",
    log:
        "logs/mosdepth_quantize_precision/{dataset}.log",
    params:
        quantize="0:1:5:150",  # needed for quantized.bed.gz
        precision="5",  # optional, set decimals of precision
    threads: 4
    wrapper:
        "v9.8.0/bio/mosdepth"


rule mosdepth_cram:
    input:
        bam="aligned/{dataset}.cram",
        bai="aligned/{dataset}.cram.crai",
        bed="test.bed",
        fasta="genome.fasta",
    output:
        dist="mosdepth_cram/{dataset}.mosdepth.global.dist.txt",
        bed_dist="mosdepth_cram/{dataset}.mosdepth.region.dist.txt",
        bed="mosdepth_cram/{dataset}.regions.bed.gz",
        summary="mosdepth_cram/{dataset}.mosdepth.summary.txt",
    log:
        "logs/mosdepth_cram/{dataset}.log",
    params:
        extra="--use-median",
    threads: 4
    wrapper:
        "v9.8.0/bio/mosdepth"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The by param allows to specify (integer) window-sizes (incompatible with input BED).

  • The threshold param allows to, for or each interval in –by, write number of bases covered by at least threshold bases. Specify multiple integer values separated by ‘,’.

  • The precision param allows to specify output floating point precision.

  • The extra param allows for additional program arguments.

  • For more information see, https://github.com/brentp/mosdepth

Software dependencies

  • mosdepth=0.3.14

Input/Output

Input:

  • BAM/CRAM files

  • reference genome (optional)

  • BED file (optional)

Output:

  • Several coverage summary files.

Authors

  • William Rowell

  • David Lähnemann

  • Filipe Vieira

Code

__author__ = "William Rowell"
__copyright__ = "Copyright 2020, William Rowell"
__email__ = "wrowell@pacb.com"
__license__ = "MIT"


import tempfile
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)


if not snakemake.output.get("per_base"):
    extra += " --no-per-base"


if snakemake.output.get("quant"):
    extra += f" --quantize {snakemake.params.quantize}"


if snakemake.output.get("thres"):
    extra += f" --thresholds {snakemake.params.thresholds}"


if snakemake.output.get("bed_dist") or snakemake.output.get("bed"):
    by = snakemake.input.get("bed", snakemake.params.get("by", False))
    if by:
        extra += f" --by {by}"


precision = snakemake.params.get("precision", "")
if precision:
    precision = f"MOSDEPTH_PRECISION={precision}"


fasta = snakemake.input.get("fasta", "")
if fasta:
    fasta = f"--fasta {fasta}"


# mosdepth uses additional threads for decompression, passed to the '--threads' argument
threads = "" if snakemake.threads <= 1 else "--threads {}".format(snakemake.threads - 1)


with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "{precision} mosdepth {threads} {fasta} {extra} {tmpdir}/temp {snakemake.input.bam} {log}"
    )

    if snakemake.output.get("summary"):
        shell(
            "mv --verbose {tmpdir}/temp.mosdepth.summary.txt {snakemake.output.summary} {log}"
        )

    if snakemake.output.get("per_base"):
        assert snakemake.output.per_base.endswith("gz")
        shell(
            "mv --verbose {tmpdir}/temp.per-base.bed.gz {snakemake.output.per_base} {log}"
        )

    if snakemake.output.get("quant"):
        assert snakemake.output.quant.endswith("gz")
        shell(
            "mv --verbose {tmpdir}/temp.quantized.bed.gz {snakemake.output.quant} {log}"
        )

    if snakemake.output.get("thres"):
        assert snakemake.output.thres.endswith("gz")
        shell(
            "mv --verbose {tmpdir}/temp.thresholds.bed.gz {snakemake.output.thres} {log}"
        )

    if snakemake.output.get("dist"):
        shell(
            "mv --verbose {tmpdir}/temp.mosdepth.global.dist.txt {snakemake.output.dist} {log}"
        )

    if snakemake.output.get("bed_dist"):
        shell(
            "mv --verbose {tmpdir}/temp.mosdepth.region.dist.txt {snakemake.output.bed_dist} {log}"
        )

    if snakemake.output.get("bed"):
        assert snakemake.output.bed.endswith("gz")
        shell("mv --verbose {tmpdir}/temp.regions.bed.gz {snakemake.output.bed} {log}")