VARSCAN MPILEUP2SNP

Detect variants in NGS data from Samtools mpileup with VarScan

Example

This wrapper can be used in the following way:

rule mpileup_to_vcf:
    input:
        "mpileup/{sample}.mpileup.gz"
    output:
        "vcf/{sample}.vcf"
    message:
        "Calling SNP with Varscan2"
    threads:  # VarScan itself does not take any threading option.
        1     # However, a gzipped mpileup has to be decompressed first.
              # Keep threads at 1 for uncompressed mpileup input;
              # set it to 2 for gzipped mpileup files.
    # Optional: memory available to the JVM. Snakemake respects this value when global
    # resource restrictions apply (https://snakemake.readthedocs.io/en/latest/snakefiles/rules.html#resources),
    # and it can be used to request RAM during cluster job submission as `{resources.mem_mb}`:
    # https://snakemake.readthedocs.io/en/latest/executing/cluster.html#job-properties
    resources:
        mem_mb=1024
    log:
        "logs/varscan_{sample}.log"
    wrapper:
        "v3.8.0/bio/varscan/mpileup2snp"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

VarScan itself does not take any threading option. However, the mpileup file given as input may be gzipped, in which case the wrapper decompresses it with zcat and pipes the result to VarScan.

If so, it’s recommended to use two threads:

  • 1 for varscan itself

  • 1 for zcat
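The thread recommendation above can be expressed as a small helper. This is a minimal sketch, not part of the wrapper: the function name `threads_for_pileup` is hypothetical, and the suffix check simply mirrors the `endswith("gz")` test in the wrapper code.

```python
def threads_for_pileup(path: str) -> int:
    """Suggest a thread count for a given mpileup path.

    Hypothetical helper: 2 threads for gzipped input
    (one for zcat, one for VarScan), otherwise 1.
    """
    return 2 if path.endswith("gz") else 1


print(threads_for_pileup("mpileup/sample1.mpileup.gz"))  # gzipped input
print(threads_for_pileup("mpileup/sample1.mpileup"))     # plain input
```

Keeping the check in one place avoids the thread count and the decompression logic drifting apart if the naming convention of the input files changes.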

Software dependencies

  • varscan=2.4.6

  • snakemake-wrapper-utils=0.6.2

Input/Output

Input:

  • An mpileup file (plain text or gzipped)

Output:

  • A VCF file

Authors

  • Thibault Dayris

Code

"""Snakemake wrapper for Varscan2 mpileup2snp"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2019, Dayris Thibault"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

import os.path as op
from snakemake.shell import shell
from snakemake.utils import makedirs
from snakemake_wrapper_utils.java import get_java_opts

# Gathering extra parameters and logging behaviour
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
extra = snakemake.params.get("extra", "")
java_opts = get_java_opts(snakemake)

# If the input mpileup file is gzipped,
# it is decompressed with zcat and piped to VarScan.
# In that case, it is recommended to use at least 2 threads:
# - One for unzipping with zcat
# - One for running varscan
pileup = (
    " cat {} ".format(snakemake.input[0])
    if not snakemake.input[0].endswith("gz")
    else " zcat {} ".format(snakemake.input[0])
)

# Building output directories
makedirs(op.dirname(snakemake.output[0]))

shell(
    "varscan mpileup2snp "  # Tool and its subcommand
    "<( {pileup} ) "  # Mpileup stream, read through process substitution
    "{java_opts} {extra} "  # JVM options and extra parameters
    "> {snakemake.output[0]} "  # Path to vcf file
    "{log}"  # Logging behaviour
)
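To see what command line the wrapper ends up running, the string assembly can be reproduced outside Snakemake. The following is a rough sketch under stated assumptions: `build_varscan_command` is a hypothetical function, the `-Xmx1024M` value is only an illustrative JVM option, the log redirection is left out, and, unlike the wrapper, the sketch quotes file paths with `shlex.quote`.

```python
import shlex


def build_varscan_command(pileup_path, vcf_path, java_opts="", extra=""):
    """Assemble a bash command line similar to the one the wrapper runs.

    Hypothetical helper for illustration only; unlike the wrapper,
    it quotes file paths with shlex.quote.
    """
    # Gzipped input is decompressed with zcat, plain input is read with cat
    reader = "zcat" if pileup_path.endswith("gz") else "cat"
    pileup = "{} {}".format(reader, shlex.quote(pileup_path))
    parts = [
        "varscan mpileup2snp",
        "<( {} )".format(pileup),  # bash process substitution
        java_opts,
        extra,
        "> {}".format(shlex.quote(vcf_path)),
    ]
    # Drop empty pieces so the command has no stray double spaces
    return " ".join(part for part in parts if part)


print(build_varscan_command("mpileup/a.mpileup.gz", "vcf/a.vcf", java_opts="-Xmx1024M"))
# prints: varscan mpileup2snp <( zcat mpileup/a.mpileup.gz ) -Xmx1024M > vcf/a.vcf
```

The `<( ... )` construct is bash process substitution, so a command built this way only works in a shell that supports it.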