SAMTOOLS MARKDUP

Mark duplicate alignments in a coordinate-sorted file.

URL: http://www.htslib.org/doc/samtools-markdup.html

Example

This wrapper can be used in the following way:

rule samtools_markdup:
    input:
        aln="{sample}.bam",
    output:
        bam="{sample}.markdup.bam",
        idx="{sample}.markdup.bam.csi",
    log:
        "{sample}.markdup.log",
    params:
        extra="-c --no-PG",
    threads: 2
    wrapper:
        "v7.6.0/bio/samtools/markdup"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • samtools=1.22.1

  • snakemake-wrapper-utils=0.8.0

Input/Output

Input:

  • SAM/BAM/CRAM file

Output:

  • SAM/BAM/CRAM file

  • SAM/BAM/CRAM index file (optional)

  • duplicate statistics file (optional; named output metrics, passed to samtools markdup as -f)

Params

  • extra: additional program arguments (not -@/--threads, --write-index, -m, -T, -f, -o or -O/--output-fmt).
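
The flags excluded above are set by the wrapper itself, so passing them again through extra is likely to cause conflicts. The following is a minimal sketch of a pre-flight check, not part of the wrapper: the helper name find_reserved_flags and the RESERVED_FLAGS set are hypothetical, and the flag list is copied from the description above.

```python
import shlex

# Flags reserved by the wrapper, copied from the "extra" description above.
RESERVED_FLAGS = {
    "-@", "--threads", "--write-index",
    "-m", "-T", "-f", "-o", "-O", "--output-fmt",
}


def find_reserved_flags(extra):
    """Return the reserved flags found in an `extra` string (hypothetical helper)."""
    found = []
    for token in shlex.split(extra):
        if token.startswith("--"):
            # Long option: ignore an attached "=value" part.
            flag = token.split("=", 1)[0]
        elif token.startswith("-"):
            # Short option: ignore an attached value such as "-@4".
            flag = token[:2]
        else:
            # Plain option value, not a flag.
            continue
        if flag in RESERVED_FLAGS:
            found.append(flag)
    return found


if __name__ == "__main__":
    print(find_reserved_flags("-c --no-PG"))
    print(find_reserved_flags("-c --threads 4 -f stats.txt"))
```

The check is deliberately simple: it does not detect reserved short flags bundled with other short flags in a single token.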

Authors

  • Filipe G. Vieira

Code

__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2024, Filipe G. Vieira"
__license__ = "MIT"


import tempfile
from pathlib import Path
from snakemake.shell import shell
from snakemake_wrapper_utils.samtools import get_samtools_opts

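# Options derived from the rule by the wrapper utilities (see the reserved
# flags under Params); the output path is passed positionally further down.
samtools_opts = get_samtools_opts(snakemake, parse_output=False)
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Optional named output "metrics": write duplicate statistics to it via -f.
metrics = snakemake.output.get("metrics", "")
if metrics:
    metrics = f"-f {metrics}"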


with tempfile.TemporaryDirectory() as tmpdir:
    tmp_prefix = Path(tmpdir) / "samtools_markdup"
    shell(
        "samtools markdup {samtools_opts} {extra} -T {tmp_prefix} {metrics} {snakemake.input[0]} {snakemake.output[0]} {log}"
    )
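
To see how the pieces of the command fit together without running samtools, the assembly can be sketched as a plain function. This is an illustration under stated assumptions, not the wrapper's code: build_markdup_command is a hypothetical name, the samtools_opts value in the usage example is a placeholder rather than the actual output of get_samtools_opts, and empty parts are dropped here whereas the wrapper's format string leaves extra spaces.

```python
def build_markdup_command(aln, out, tmp_prefix, samtools_opts="", extra="",
                          metrics="", log=""):
    """Assemble a samtools markdup command line (hypothetical helper).

    Mirrors the argument order used in the wrapper's shell() call:
    options, extra, -T prefix, optional -f metrics, input, output, log.
    """
    # The statistics file is only requested when a metrics path is given.
    metrics_opt = f"-f {metrics}" if metrics else ""
    parts = [
        "samtools markdup",
        samtools_opts,
        extra,
        f"-T {tmp_prefix}",
        metrics_opt,
        aln,
        out,
        log,
    ]
    # Drop empty parts so the result has single spaces only.
    return " ".join(part for part in parts if part)


if __name__ == "__main__":
    print(
        build_markdup_command(
            "A.bam",
            "A.markdup.bam",
            "/tmp/example/samtools_markdup",
            samtools_opts="--threads 1",  # placeholder value
            extra="-c --no-PG",
            metrics="A.markdup.stats.txt",
        )
    )
```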