VARDICT

Run Vardict to call genomic variants

URL:

Example

This wrapper can be used in the following way:

rule vardict_single_mode:
    input:
        reference="data/genome.fasta",
        regions="regions.bed",
        bam="mapped/{sample}.bam",
    output:
        vcf="vcf/{sample}.s.vcf",
    params:
        extra="",
        bed_columns="-c 1 -S 2 -E 3 -g 4",  # Optional, default is -c 1 -S 2 -E 3 -g 4
        allele_frequency_threshold="0.01",  # Optional, default is 0.01
    threads: 1
    log:
        "logs/varscan_{sample}_s_.log",
    wrapper:
        "v1.1.0/bio/vardict"


rule vardict_paired_mode:
    input:
        reference="data/genome.fasta",
        regions="regions.bed",
        bam="mapped/{sample}.bam",
        normal="mapped/b.bam",
    output:
        vcf="vcf/{sample}.tn.vcf",
    params:
        extra="",
    threads: 1
    log:
        "logs/varscan_{sample}_tn.log",
    wrapper:
        "v1.1.0/bio/vardict"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • vardict-java==1.8.2

Input/Output

Input:

  • reference file
  • bam file
  • normal file, optional (must be set for tumor/normal mode)
  • region file

Output:

  • A VCF file

Params

  • extra. optional:
  • bed_columns, optional, default -c 1 -S 2 -E 3 -g 4:
  • ah_th optional, default values is 0.01:

Authors

  • Patrik Smeds

Code

"""Snakemake wrapper for VarDict Single sample mode"""

__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2021, Patrik Smeds"
__email__ = "patrik.smeds@scilifelab.uu.se"
__license__ = "MIT"

from pathlib import Path
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

reference = snakemake.input.reference
regions = snakemake.input.regions
bam = snakemake.input.bam
normal = snakemake.input.get("normal", None)
vcf = snakemake.output.vcf

extra = snakemake.params.get("extra", "")
bed_columns = snakemake.params.get("bed_columns", "-c 1 -S 2 -E 3 -g 4")
af_th = snakemake.params.get("allele_frequency_threshold", "0.01")


if normal is None:
    input_bams = bam
    name = snakemake.params.get("sample_name", Path(bam).stem)
    post_scripts = (
        "teststrandbias.R | var2vcf_valid.pl -A -N '" + name + "' -E -f " + af_th
    )
else:
    input_bams = "'" + bam + "|" + normal + "'"
    name = snakemake.params.get("sample_name", Path(bam).stem + "|" + Path(normal).stem)
    post_scripts = 'testsomatic.R | var2vcf_paired.pl -N "' + name + '" -f ' + af_th


shell(
    "vardict-java -G {reference} "
    "-f {af_th} "
    " {extra} "
    "-th {snakemake.threads} "
    "{bed_columns} "
    "-N '{name}' "
    "-b {input_bams} "
    "{regions} |"
    "{post_scripts} "
    "> {vcf}"
    "{log}"
)