READ_DUPLICATION


Estimate the read duplication rate of a SAM/BAM file with RSeQC's read_duplication.py.

URL: https://rseqc.sourceforge.net/#read-duplication-py

Example

This wrapper can be used in the following way:

rule test_rseqc_read_duplication:
    input:
        "A.bam",
    output:
        pos="a.dup.pos.xls",
        seq="a.dup.seq.xls",
        plot_r="script.a_plot.R",
        pdf="a.pdf",
    log:
        "rseqc.log",
    params:
        extra="-q 10",
    wrapper:
        "v3.9.0-14-g476823b/bio/rseqc/read_duplication"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • rseqc=5.0.3

Input/Output

Input:

  • Path to SAM/BAM file

Output:

  • pos: Optional path to the table of read duplication rates determined from the mapping positions of reads

  • seq: Optional path to the table of read duplication rates determined from the sequences of reads

  • plot_r: Optional path to the R script that draws the duplication plot

  • pdf: Optional path to the duplication plot in PDF format
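
As an illustration of how the pos or seq table might be consumed downstream, here is a minimal sketch. It assumes the table is tab-separated text with one header line followed by two columns (the occurrence count and the number of distinct reads seen that many times); verify this against your own output before relying on it. The function name `duplication_fraction`, the header names, and the sample values are hypothetical.

```python
import csv
import io


def duplication_fraction(handle):
    """Return the fraction of reads that are duplicates.

    Assumes a tab-separated table: one header line, then rows of
    (occurrence, number of distinct reads seen that many times).
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line, if any
    total_reads = 0
    distinct_reads = 0
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        occurrence, count = int(row[0]), int(row[1])
        total_reads += occurrence * count
        distinct_reads += count
    if total_reads == 0:
        return 0.0
    return 1 - distinct_reads / total_reads


# Hypothetical table contents, not real RSeQC output.
example = "Occurrence\tUniqReadNumber\n1\t80\n2\t10\n"
print(duplication_fraction(io.StringIO(example)))
```

With a real run, the handle would come from something like `open("a.dup.pos.xls")` instead of the in-memory string.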

Params

  • extra: Optional additional command-line arguments for read_duplication.py, other than -i (--input-file) and -o (--out-prefix), which the wrapper sets itself.

Authors

  • Thibault Dayris

Code

# coding: utf-8

"""Snakemake wrapper for RSeQC read_duplication.py"""

__author__ = "Thibault Dayris"
__mail__ = "thibault.dayris@gustaveroussy.fr"
__copyright__ = "Copyright 2024, Thibault Dayris"
__license__ = "MIT"

from tempfile import TemporaryDirectory
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)


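# Write all results under a temporary prefix, then move only the
# outputs the rule actually requested to their final paths.
with TemporaryDirectory() as tempdir:
    shell(
        "read_duplication.py {extra} "
        "--input-file {snakemake.input} "
        "--out-prefix {tempdir}/out "
        "{log} "
    )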

    if "pos" in snakemake.output.keys():
        shell(
            "mv --verbose "
            "{tempdir}/out.pos.DupRate.xls "
            "{snakemake.output.pos} "
            "{log} "
        )

    if "seq" in snakemake.output.keys():
        shell(
            "mv --verbose "
            "{tempdir}/out.seq.DupRate.xls "
            "{snakemake.output.seq} "
            "{log} "
        )

    if "plot_r" in snakemake.output.keys():
        shell(
            "mv --verbose "
            "{tempdir}/out.DupRate_plot.r "
            "{snakemake.output.plot_r} "
            "{log} "
        )

    if "pdf" in snakemake.output.keys():
        shell(
            "mv --verbose "
            "{tempdir}/out.DupRate_plot.pdf "
            "{snakemake.output.pdf} "
            "{log} "
        )
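
The four conditional moves above follow one pattern: each named output corresponds to a fixed suffix appended to the temporary prefix. The sketch below expresses that mapping as a lookup table. It is an illustration only, not part of the wrapper: `SUFFIXES` and `plan_moves` are hypothetical names, and a plain dict stands in for `snakemake.output`.

```python
# Suffixes copied from the wrapper code: output name -> suffix of the
# file that read_duplication.py writes for the given prefix.
SUFFIXES = {
    "pos": ".pos.DupRate.xls",
    "seq": ".seq.DupRate.xls",
    "plot_r": ".DupRate_plot.r",
    "pdf": ".DupRate_plot.pdf",
}


def plan_moves(prefix, outputs):
    """Return (source, destination) pairs for the requested outputs only."""
    return [
        (prefix + suffix, outputs[name])
        for name, suffix in SUFFIXES.items()
        if name in outputs
    ]


# Stand-in for snakemake.output when only two outputs are requested.
requested = {"pos": "a.dup.pos.xls", "pdf": "a.pdf"}
for src, dst in plan_moves("tmpdir/out", requested):
    print(src, "->", dst)
```

Outputs that the rule does not declare are simply left in the temporary directory and discarded with it, which is why every named output is optional.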