PURGE_DUPS
Purge haplotigs and overlaps in an assembly based on read depth
URL: https://github.com/dfguan/purge_dups
Example
This wrapper can be used in the following way:
rule purge_dups:
    input:
        paf="split.self.paf.gz",
        #cov="pbcstat.cov",
        #cutoff="calcuts.cutoffs",
    output:
        "out/purge_dups.bed",
    log:
        "logs/purge_dups.log",
    params:
        extra="-2",
    threads: 1
    wrapper:
        "v5.8.0-3-g915ba34/bio/purge_dups/purge_dups"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The extra param passes additional program arguments through to purge_dups (for example, -2 in the rule above).
Software dependencies
purge_dups=1.2.6
Input/Output
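Input:
paf: Self-aligned split assembly in PAF format
cov (optional): coverage file, passed to purge_dups as -c
cutoff (optional): cutoffs file, passed to purge_dups as -T
Output:
BED file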
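A variant of the example rule with both optional inputs enabled might look like the sketch below. The file names are the ones commented out in the example rule. How those files are produced is not covered by this wrapper; the sketch assumes upstream rules create them.

```python
# Sketch only: same rule as in the example, with the optional inputs enabled.
# "pbcstat.cov" and "calcuts.cutoffs" are assumed to exist or to be produced
# by other rules in the workflow.
rule purge_dups:
    input:
        paf="split.self.paf.gz",
        cov="pbcstat.cov",
        cutoff="calcuts.cutoffs",
    output:
        "out/purge_dups.bed",
    log:
        "logs/purge_dups.log",
    params:
        extra="-2",
    threads: 1
    wrapper:
        "v5.8.0-3-g915ba34/bio/purge_dups/purge_dups"
```

With these inputs present, the wrapper code adds -c and -T to the purge_dups command line.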
Code
__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"
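from snakemake.shell import shell

# Optional extra arguments and stderr redirection to the log file
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Optional coverage file, passed as -c when provided
cov = snakemake.input.get("cov", "")
if cov:
    cov = f"-c {cov}"

# Optional cutoffs file, passed as -T when provided
cutoff = snakemake.input.get("cutoff", "")
if cutoff:
    cutoff = f"-T {cutoff}"

# purge_dups writes its result to stdout, which is redirected to the output file
shell(
    "purge_dups {cov} {cutoff} {extra} {snakemake.input.paf} > {snakemake.output[0]} {log}"
)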
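To see which command line the wrapper ends up running, the assembly logic above can be mirrored in a small stand-alone function. This is a minimal sketch, not part of the wrapper: the function name build_purge_dups_cmd is hypothetical, and the stderr redirection that the real wrapper appends via log_fmt_shell is left out for brevity.

```python
def build_purge_dups_cmd(paf, out, cov="", cutoff="", extra=""):
    """Mirror the wrapper's command assembly (hypothetical helper, no logging)."""
    # Optional inputs only contribute a flag when they are provided
    cov_opt = f"-c {cov}" if cov else ""
    cutoff_opt = f"-T {cutoff}" if cutoff else ""
    parts = ["purge_dups", cov_opt, cutoff_opt, extra, paf, ">", out]
    # Drop empty pieces so the command has no doubled spaces
    return " ".join(p for p in parts if p)


# Only the required PAF input, as in the example rule
cmd_minimal = build_purge_dups_cmd(
    "split.self.paf.gz", "out/purge_dups.bed", extra="-2"
)

# Both optional inputs enabled
cmd_full = build_purge_dups_cmd(
    "split.self.paf.gz",
    "out/purge_dups.bed",
    cov="pbcstat.cov",
    cutoff="calcuts.cutoffs",
    extra="-2",
)

print(cmd_minimal)
print(cmd_full)
```

Unlike this sketch, the wrapper leaves empty placeholders in the format string, so the actual shell command may contain extra spaces; the shell treats both forms the same.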