VARSCAN MPILEUP2INDEL#

Detect indels in NGS data from mpileup files with VarScan.

Example#

This wrapper can be used in the following way:

rule mpileup_to_vcf:
    input:
        "mpileup/{sample}.mpileup.gz"
    output:
        "vcf/{sample}.vcf"
    message:
        "Calling indels with VarScan2"
    threads:  # VarScan does not take any threading information
        1     # However, the mpileup input might have to be decompressed.
              # Keep the thread count at one for uncompressed mpileup input
              # Set it to two for gzipped mpileup files
    # optional specification of memory usage of the JVM that snakemake will respect with global
    # resource restrictions (https://snakemake.readthedocs.io/en/latest/snakefiles/rules.html#resources)
    # and which can be used to request RAM during cluster job submission as `{resources.mem_mb}`:
    # https://snakemake.readthedocs.io/en/latest/executing/cluster.html#job-properties
    resources:
        mem_mb=1024
    log:
        "logs/varscan_{sample}.log"
    wrapper:
        "v3.0.1/bio/varscan/mpileup2indel"

Note that input, output and log file paths can be chosen freely.
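
The `mem_mb` resource in the rule above is meant to reach the JVM through the `get_java_opts` helper that the wrapper calls. As a rough illustration only, the sketch below shows how a memory resource could be turned into a heap-size flag. The function `sketch_java_opts` is a hypothetical, simplified stand-in written for this example; the real helper in `snakemake-wrapper-utils` may behave differently.

```python
def sketch_java_opts(mem_mb=None, java_opts=""):
    """Hypothetical, simplified stand-in for get_java_opts (not the real helper)."""
    opts = java_opts.split()
    # Add a heap limit only when a memory resource is set and no explicit
    # -Xmx option was already supplied by the user.
    if mem_mb is not None and not any(o.startswith("-Xmx") for o in opts):
        opts.append(f"-Xmx{int(mem_mb)}M")
    return " ".join(opts)


print(sketch_java_opts(mem_mb=1024))  # prints: -Xmx1024M
```

Under this assumption, `mem_mb=1024` would cap the JVM heap at 1024 MB, while an explicitly supplied `-Xmx` option would be left untouched.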

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes#

VarScan does not take any threading information by itself. However, mpileup files given as input might be gzipped.

If so, it’s recommended to use two threads:

  • 1 for varscan itself

  • 1 for zcat
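
The thread recommendation above can be expressed as a small helper. This is a minimal sketch: `threads_for_pileup` is a hypothetical name introduced here, and it simply mirrors the wrapper's own check of whether the input path ends in `gz`.

```python
def threads_for_pileup(path):
    """Return the suggested thread count for an mpileup input path."""
    # Mirrors the wrapper's check: any path ending in "gz" is treated as
    # gzipped, so one extra thread is reserved for zcat.
    return 2 if str(path).endswith("gz") else 1


print(threads_for_pileup("mpileup/a.mpileup.gz"))  # prints: 2
print(threads_for_pileup("mpileup/a.mpileup"))     # prints: 1
```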

Software dependencies#

  • varscan=2.4.6

  • snakemake-wrapper-utils=0.6.2

Input/Output#

Input:

  • An mpileup file (optionally gzipped)

Output:

  • A VCF file
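
The wrapper forwards an optional `extra` parameter to VarScan unchanged. VarScan's `mpileup2indel` documents options such as `--output-vcf 1` (write VCF instead of its native tabular format) and `--min-coverage`. Whether a given setup needs them is an assumption to check against the VarScan manual. The sketch below shows one way to assemble such a string; `build_extra` is a hypothetical helper, not part of the wrapper.

```python
import shlex


def build_extra(options):
    """Join a mapping of VarScan option names to values into one string."""
    parts = []
    for name, value in options.items():
        parts.append(f"--{name}")
        # Quote each value so it survives being pasted into a shell command.
        parts.append(shlex.quote(str(value)))
    return " ".join(parts)


extra = build_extra({"output-vcf": 1, "min-coverage": 8})
print(extra)  # prints: --output-vcf 1 --min-coverage 8
```

The resulting string could then be passed to the rule as `params: extra=...`.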

Authors#

  • Thibault Dayris

Code#

"""Snakemake wrapper for Varscan2 mpileup2indel"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2019, Dayris Thibault"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

import os.path as op
from snakemake.shell import shell
from snakemake.utils import makedirs
from snakemake_wrapper_utils.java import get_java_opts

# Gathering extra parameters and logging behaviour
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
extra = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

# Gzipped mpileup input is decompressed with zcat and streamed to VarScan;
# uncompressed input is streamed with cat.
# For gzipped input, it is recommended to use at least 2 threads:
# - One for decompressing with zcat
# - One for running VarScan
pileup = (
    " cat {} ".format(snakemake.input[0])
    if not snakemake.input[0].endswith("gz")
    else " zcat {} ".format(snakemake.input[0])
)

# Building output directories
makedirs(op.dirname(snakemake.output[0]))

shell(
    "varscan mpileup2indel "  # Tool and its subcommand
    "<( {pileup} ) "
    "{java_opts} {extra} "  # Extra parameters
    "> {snakemake.output[0]} "  # Path to vcf file
    "{log}"  # Logging behaviour
)
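
To see what command the wrapper assembles without running Snakemake, the string building can be reproduced in isolation. The sketch below is an approximation: `build_command` is a hypothetical function, and unlike the wrapper it quotes the paths with `shlex.quote`, which is an addition made here for safety. The `<( ... )` process substitution assumes the command is run by bash.

```python
import shlex


def build_command(input_path, output_path, java_opts="", extra="", log=""):
    """Assemble a VarScan mpileup2indel command line as a single string."""
    # Pick the reader the same way the wrapper does: zcat for paths ending
    # in "gz", plain cat otherwise.
    reader = "zcat" if input_path.endswith("gz") else "cat"
    return (
        "varscan mpileup2indel "
        f"<( {reader} {shlex.quote(input_path)} ) "  # streamed mpileup input
        f"{java_opts} {extra} "                      # JVM and VarScan options
        f"> {shlex.quote(output_path)} "             # output file
        f"{log}"                                     # log redirection, if any
    ).strip()


print(build_command("mpileup/a.mpileup.gz", "vcf/a.vcf", log="2> logs/a.log"))
```

Printing the command this way can help when checking how `extra` and the log redirection end up on the final command line.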