MOSDEPTH
fast BAM/CRAM depth calculation
Example
This wrapper can be used in the following way:
rule mosdepth:
input:
bam="aligned/{dataset}.bam",
bai="aligned/{dataset}.bam.bai",
output:
dist="mosdepth/{dataset}.mosdepth.global.dist.txt",
per_base="mosdepth/{dataset}.per-base.bed.gz",
summary="mosdepth/{dataset}.mosdepth.summary.txt",
log:
"logs/mosdepth/{dataset}.log",
params:
extra="--fast-mode",
threads: 4
wrapper:
"v9.8.0/bio/mosdepth"
rule mosdepth_bed:
input:
bam="aligned/{dataset}.bam",
bai="aligned/{dataset}.bam.bai",
bed="test.bed",
output:
dist="mosdepth_bed/{dataset}.mosdepth.global.dist.txt",
bed_dist="mosdepth_bed/{dataset}.mosdepth.region.dist.txt",
bed="mosdepth_bed/{dataset}.regions.bed.gz",
summary="mosdepth_bed/{dataset}.mosdepth.summary.txt",
log:
"logs/mosdepth_bed/{dataset}.log",
params:
extra="--use-median",
threads: 4
wrapper:
"v9.8.0/bio/mosdepth"
rule mosdepth_by_threshold:
input:
bam="aligned/{dataset}.bam",
bai="aligned/{dataset}.bam.bai",
output:
dist="mosdepth_by_threshold/{dataset}.mosdepth.global.dist.txt",
bed_dist="mosdepth_by_threshold/{dataset}.mosdepth.region.dist.txt",
bed="mosdepth_by_threshold/{dataset}.regions.bed.gz",
thres="mosdepth_by_threshold/{dataset}.thresholds.bed.gz",
summary="mosdepth_by_threshold/{dataset}.mosdepth.summary.txt",
log:
"logs/mosdepth_by/{dataset}.log",
params:
by="500", # needed for mosdepth.region.dist.txt and regions.bed.gz
thresholds="1,5,10,30", # needed for thresholds.bed.gz
threads: 4
wrapper:
"v9.8.0/bio/mosdepth"
rule mosdepth_quantize_precision:
input:
bam="aligned/{dataset}.bam",
bai="aligned/{dataset}.bam.bai",
output:
dist="mosdepth_quantize_precision/{dataset}.mosdepth.global.dist.txt",
quant="mosdepth_quantize_precision/{dataset}.quantized.bed.gz",
summary="mosdepth_quantize_precision/{dataset}.mosdepth.summary.txt",
log:
"logs/mosdepth_quantize_precision/{dataset}.log",
params:
quantize="0:1:5:150", # needed for quantized.bed.gz
precision="5", # optional, set decimals of precision
threads: 4
wrapper:
"v9.8.0/bio/mosdepth"
rule mosdepth_cram:
input:
bam="aligned/{dataset}.cram",
bai="aligned/{dataset}.cram.crai",
bed="test.bed",
fasta="genome.fasta",
output:
dist="mosdepth_cram/{dataset}.mosdepth.global.dist.txt",
bed_dist="mosdepth_cram/{dataset}.mosdepth.region.dist.txt",
bed="mosdepth_cram/{dataset}.regions.bed.gz",
summary="mosdepth_cram/{dataset}.mosdepth.summary.txt",
log:
"logs/mosdepth_cram/{dataset}.log",
params:
extra="--use-median",
threads: 4
wrapper:
"v9.8.0/bio/mosdepth"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The by param allows to specify (integer) window-sizes (incompatible with input BED).
The threshold param allows to, for or each interval in –by, write number of bases covered by at least threshold bases. Specify multiple integer values separated by ‘,’.
The precision param allows to specify output floating point precision.
The extra param allows for additional program arguments.
For more information see, https://github.com/brentp/mosdepth
Software dependencies
mosdepth=0.3.14
Input/Output
Input:
BAM/CRAM files
reference genome (optional)
BED file (optional)
Output:
Several coverage summary files.
Code
__author__ = "William Rowell"
__copyright__ = "Copyright 2020, William Rowell"
__email__ = "wrowell@pacb.com"
__license__ = "MIT"
import tempfile
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)
if not snakemake.output.get("per_base"):
extra += " --no-per-base"
if snakemake.output.get("quant"):
extra += f" --quantize {snakemake.params.quantize}"
if snakemake.output.get("thres"):
extra += f" --thresholds {snakemake.params.thresholds}"
if snakemake.output.get("bed_dist") or snakemake.output.get("bed"):
by = snakemake.input.get("bed", snakemake.params.get("by", False))
if by:
extra += f" --by {by}"
precision = snakemake.params.get("precision", "")
if precision:
precision = f"MOSDEPTH_PRECISION={precision}"
fasta = snakemake.input.get("fasta", "")
if fasta:
fasta = f"--fasta {fasta}"
# mosdepth uses additional threads for decompression, passed to the '--threads' argument
threads = "" if snakemake.threads <= 1 else "--threads {}".format(snakemake.threads - 1)
with tempfile.TemporaryDirectory() as tmpdir:
shell(
"{precision} mosdepth {threads} {fasta} {extra} {tmpdir}/temp {snakemake.input.bam} {log}"
)
if snakemake.output.get("summary"):
shell(
"mv --verbose {tmpdir}/temp.mosdepth.summary.txt {snakemake.output.summary} {log}"
)
if snakemake.output.get("per_base"):
assert snakemake.output.per_base.endswith("gz")
shell(
"mv --verbose {tmpdir}/temp.per-base.bed.gz {snakemake.output.per_base} {log}"
)
if snakemake.output.get("quant"):
assert snakemake.output.quant.endswith("gz")
shell(
"mv --verbose {tmpdir}/temp.quantized.bed.gz {snakemake.output.quant} {log}"
)
if snakemake.output.get("thres"):
assert snakemake.output.thres.endswith("gz")
shell(
"mv --verbose {tmpdir}/temp.thresholds.bed.gz {snakemake.output.thres} {log}"
)
if snakemake.output.get("dist"):
shell(
"mv --verbose {tmpdir}/temp.mosdepth.global.dist.txt {snakemake.output.dist} {log}"
)
if snakemake.output.get("bed_dist"):
shell(
"mv --verbose {tmpdir}/temp.mosdepth.region.dist.txt {snakemake.output.bed_dist} {log}"
)
if snakemake.output.get("bed"):
assert snakemake.output.bed.endswith("gz")
shell("mv --verbose {tmpdir}/temp.regions.bed.gz {snakemake.output.bed} {log}")