DEEPVARIANT

Call genetic variants using deep neural network. Copyright 2017 Google LLC. BSD 3-Clause “New” or “Revised” https://github.com/google/deepvariant

Example

This wrapper can be used in the following way:

rule deepvariant:
    input:
        bam="mapped/{sample}.bam",
        ref="genome/genome.fasta"
    output:
        vcf="calls/{sample}.vcf.gz"
    params:
        model="wgs",   # {wgs, wes}
        extra=""
    threads: 2
    log:
        "logs/deepvariant/{sample}/stdout.log"
    wrapper:
        "0.72.0/bio/deepvariant"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • deepvariant=0.10.0
  • tensorflow-estimator=2.0.0
  • unzip=6.0

Input/Output

Input:

  • fasta
  • bam

Output:

  • vcf
  • visual report html

Notes

  • The extra param alllows for additional program arguments.
  • This snakemake wrapper uses bioconda deepvariant package. Copyright 2018 Brad Chapman.

Authors

  • Tetsuro Hisayoshi

Code

__author__ = "Tetsuro Hisayoshi"
__copyright__ = "Copyright 2020, Tetsuro Hisayoshi"
__email__ = "hisayoshi0530@gmail.com"
__license__ = "MIT"

import os
import tempfile
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

log_dir = os.path.dirname(snakemake.log[0])
output_dir = os.path.dirname(snakemake.output[0])

# sample basename
basename = os.path.splitext(os.path.basename(snakemake.input.bam[0]))[0]


with tempfile.TemporaryDirectory() as tmp_dir:
    shell(
        "(dv_make_examples.py "
        "--cores {snakemake.threads} "
        "--ref {snakemake.input.ref} "
        "--reads {snakemake.input.bam} "
        "--sample {basename} "
        "--examples {tmp_dir} "
        "--logdir {log_dir} "
        "{extra} \n"
        "dv_call_variants.py "
        "--cores {snakemake.threads} "
        "--outfile {tmp_dir}/{basename}.tmp "
        "--sample {basename} "
        "--examples {tmp_dir} "
        "--model {snakemake.params.model} \n"
        "dv_postprocess_variants.py "
        "--ref {snakemake.input.ref} "
        "--infile {tmp_dir}/{basename}.tmp "
        "--outfile {snakemake.output.vcf} ) {log}"
    )