PURGE_DUPS PBCSTAT

Purge haplotigs and overlaps in an assembly based on read depth

URL: https://github.com/dfguan/purge_dups

Example

This wrapper can be used in the following way:

rule purge_dups_pbcstat:
    input:
        paf="HiFi_dataset_01.paf.gz",
    output:
        cov="out/pbcstat.cov",
        stat="out/pbcstat.stat",
    log:
        "logs/pbcstat.log",
    params:
        extra="",
    threads: 1
    wrapper:
        "v3.9.0-1-gc294552/bio/purge_dups/pbcstat"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The extra param passes additional command-line arguments to pbcstat.
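
As a rough illustration of where extra ends up, the sketch below rebuilds the pbcstat argument list in the same order the wrapper code uses: extra first, then -O and the output directory, then the PAF inputs. The helper name build_pbcstat_command is hypothetical, and the "-M 300" value is only an assumed example; check `pbcstat -h` for the options your version accepts.

```python
import shlex


def build_pbcstat_command(paf_files, outdir, extra=""):
    """Return the pbcstat argument list in the order the wrapper uses.

    Hypothetical helper for illustration; the wrapper itself formats a
    shell string rather than building a list.
    """
    # extra is split like a shell would split it, then placed before -O.
    return ["pbcstat", *shlex.split(extra), "-O", outdir, *paf_files]


# "-M 300" is an assumed example value, not a recommendation.
cmd = build_pbcstat_command(
    ["HiFi_dataset_01.paf.gz"], "/tmp/pbcstat_out", extra="-M 300"
)
print(shlex.join(cmd))
```

An empty extra (the default in the example rule) simply contributes no arguments.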

Software dependencies

  • purge_dups=1.2.6

Input/Output

Input:

  • paf: reads mapped to the assembly, in PAF format (may be gzip-compressed, as in the example)

Output:

  • cov: coverage (pbcstat's PB.base.cov file); optional

  • stat: stats (pbcstat's PB.stat file); optional

Authors

  • Filipe Vieira

Code

__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"


import tempfile
from snakemake.shell import shell


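# Optional extra command-line arguments for pbcstat.
extra = snakemake.params.get("extra", "")
# Redirect both stdout and stderr to the rule's log file.
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


# pbcstat writes fixed file names (PB.base.cov, PB.stat) into its output
# directory, so run it in a temporary directory and copy out what was requested.
with tempfile.TemporaryDirectory() as tmpdir:
    shell("pbcstat {extra} -O {tmpdir} {snakemake.input} {log}")

    # Each named output is optional; copy only those the rule declares.
    if snakemake.output.get("cov"):
        shell("cat {tmpdir}/PB.base.cov > {snakemake.output.cov}")
    if snakemake.output.get("stat"):
        shell("cat {tmpdir}/PB.stat > {snakemake.output.stat}")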
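
The staging pattern in the wrapper code (run in a temporary directory, then copy out only the outputs the rule declares) can be tried without Snakemake or purge_dups installed. The following is a minimal sketch under stated assumptions: collect_outputs and fake_pbcstat are hypothetical names, fake_pbcstat only writes placeholder files where the real pbcstat would write its results, and the sketch uses shutil.copyfile where the wrapper uses cat.

```python
import shutil
import tempfile
from pathlib import Path

# File names pbcstat writes into its output directory, keyed by the
# wrapper's output names (taken from the wrapper code above).
PBCSTAT_FILES = {"cov": "PB.base.cov", "stat": "PB.stat"}


def collect_outputs(workdir, requested):
    """Copy the requested pbcstat results out of workdir.

    `requested` maps an output name ("cov" or "stat") to a destination
    path; names that are absent or empty are skipped.
    """
    copied = {}
    for name, filename in PBCSTAT_FILES.items():
        dest = requested.get(name)
        if not dest:
            continue
        dest = Path(dest)
        # Snakemake normally creates output directories itself;
        # this standalone sketch has to do it by hand.
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(Path(workdir) / filename, dest)
        copied[name] = dest
    return copied


def fake_pbcstat(workdir):
    """Stand-in for the real pbcstat call: writes placeholder files only."""
    for filename in PBCSTAT_FILES.values():
        (Path(workdir) / filename).write_text("placeholder\n")


with tempfile.TemporaryDirectory() as outroot, tempfile.TemporaryDirectory() as tmpdir:
    fake_pbcstat(tmpdir)
    # Only "stat" is requested here, so PB.base.cov is left behind.
    result = collect_outputs(tmpdir, {"stat": Path(outroot) / "out" / "pbcstat.stat"})
    print(sorted(result))
```

Staging in a temporary directory keeps pbcstat's fixed file names out of the working directory, so several rules using this wrapper should be able to run side by side without overwriting each other's PB.* files.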