PRESEQ LC_EXTRAP

preseq estimates the library complexity of existing sequencing data and uses that estimate to predict the yield of future experiments based on their design.

URL: https://github.com/smithlabcode/preseq

Example

This wrapper can be used in the following way:

rule preseq_lc_extrap_bam:
    input:
        "samples/{sample}.sorted.bam"
    output:
        "test_bam/{sample}.lc_extrap"
    params:
        "-v"   #optional parameters
    log:
        "logs/test_bam/{sample}.log"
    wrapper:
        "v3.6.0-3-gc8272d7/bio/preseq/lc_extrap"

rule preseq_lc_extrap_bed:
    input:
        "samples/{sample}.sorted.bed"
    output:
        "test_bed/{sample}.lc_extrap"
    params:
        "-v"   #optional parameters
    log:
        "logs/test_bed/{sample}.log"
    wrapper:
        "v3.6.0-3-gc8272d7/bio/preseq/lc_extrap"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • preseq=3.2.0

Input/Output

Input:

  • BED files containing duplicates, sorted by chromosome, start position, end position and finally strand, OR

  • BAM files containing duplicates, sorted with bamtools or samtools sort.
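The BED sort order described above can be sketched in Python. This is only an illustration, not part of the wrapper: the helper name `sort_bed_lines` is hypothetical, the third sort key is assumed to be the end position, chromosomes are compared as plain strings, and the strand column is treated as optional. It does roughly what `sort -k1,1 -k2,2n -k3,3n -k6,6` would do on the command line.

```python
def sort_bed_lines(lines):
    """Return BED lines sorted by chromosome, start, end and strand.

    Hypothetical helper for illustration; not part of preseq or the wrapper.
    """

    def key(line):
        fields = line.rstrip("\n").split("\t")
        # Column 6 (strand) is optional in BED; use "" when it is absent.
        strand = fields[5] if len(fields) > 5 else ""
        return (fields[0], int(fields[1]), int(fields[2]), strand)

    # Skip blank lines and common BED header lines before sorting.
    records = [
        line
        for line in lines
        if line.strip() and not line.startswith(("#", "track", "browser"))
    ]
    return sorted(records, key=key)
```

For large files an external sort is usually the better choice, since this sketch holds every record in memory.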

Output:

  • lc_extrap (.lc_extrap)
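As a rough sketch of how the output could be consumed downstream, the following reads a `.lc_extrap` file, assuming it is a tab-separated table with a header row and numeric columns. The function name `read_lc_extrap` is hypothetical, and the column names and values in the demo (`TOTAL_READS`, `EXPECTED_DISTINCT`) are assumptions made for illustration, not real preseq results.

```python
import csv
import io


def read_lc_extrap(handle):
    """Parse a tab-separated table with a header row into dicts of floats.

    Hypothetical helper; column names are taken from the header as found.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [{name: float(value) for name, value in row.items()} for row in reader]


if __name__ == "__main__":
    # Made-up example data; the column names are an assumption.
    sample = "TOTAL_READS\tEXPECTED_DISTINCT\n0\t0\n1000000.0\t950000.5\n"
    for row in read_lc_extrap(io.StringIO(sample)):
        print(row)
```

With a real file, open it with `open(path, newline="")` and pass the handle to `read_lc_extrap`.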

Authors

  • Antonie Vietor

Code

__author__ = "Antonie Vietor"
__copyright__ = "Copyright 2020, Antonie Vietor"
__email__ = "antonie.v@gmx.de"
__license__ = "MIT"

import os
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# preseq needs -bam for BAM input; add it unless the user already passed it.
params = ""
if os.path.splitext(snakemake.input[0])[-1] == ".bam":
    if not any("-bam" in str(p) for p in snakemake.params):
        params = "-bam "

shell(
    "(preseq lc_extrap {params} {snakemake.params} {snakemake.input[0]} -output {snakemake.output[0]}) {log}"
)
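The flag handling in the wrapper can be explored outside Snakemake with a small standalone sketch. The function `build_lc_extrap_command` is hypothetical and only assembles an argument list; it does not run preseq. As a design choice of this sketch, it looks for `-bam` among the extra parameters, so the flag is added exactly once for `.bam` input.

```python
import os
import shlex


def build_lc_extrap_command(input_path, output_path, extra=""):
    """Assemble a preseq lc_extrap command line as a list of arguments.

    Hypothetical helper for illustration; it mirrors the wrapper's idea of
    adding -bam for BAM input but is not the wrapper's own code.
    """
    args = ["preseq", "lc_extrap"]
    extra_args = shlex.split(extra)
    # BAM input needs -bam; skip it if the caller already supplied the flag.
    if os.path.splitext(input_path)[-1] == ".bam" and "-bam" not in extra_args:
        args.append("-bam")
    args += extra_args
    args += [input_path, "-output", output_path]
    return args
```

Returning a list rather than a single string lets the result be passed to `subprocess.run` without shell quoting concerns.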