DEEPVARIANT
Call genetic variants using a deep neural network. Copyright 2017 Google LLC. Licensed under the BSD 3-Clause “New” or “Revised” License. https://github.com/google/deepvariant
Example
This wrapper can be used in the following way:
rule deepvariant:
    input:
        bam="mapped/{sample}.bam",
        ref="genome/genome.fasta"
    output:
        vcf="calls/{sample}.vcf.gz"
    params:
        model="wgs",  # {wgs, wes, pacbio, hybrid}
        sample_name=lambda w: w.sample,  # optional
        extra=""
    threads: 2
    log:
        "logs/deepvariant/{sample}/stdout.log"
    wrapper:
        "v3.9.0-1-gc294552/bio/deepvariant"

rule deepvariant_gvcf:
    input:
        bam="mapped/{sample}.bam",
        ref="genome/genome.fasta"
    output:
        vcf="gvcf_calls/{sample}.vcf.gz",
        gvcf="gvcf_calls/{sample}.g.vcf.gz"
    params:
        model="wgs",  # {wgs, wes, pacbio, hybrid}
        extra=""
    threads: 2
    log:
        "logs/deepvariant/{sample}/stdout.log"
    wrapper:
        "v3.9.0-1-gc294552/bio/deepvariant"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The extra param allows for additional program arguments. In the wrapper code below, these arguments are appended to the dv_make_examples.py call.
This Snakemake wrapper uses the Bioconda deepvariant package. Copyright 2018 Brad Chapman.
Software dependencies
deepvariant=1.4
numpy=1.23
Input/Output
Input:
fasta
bam
Output:
vcf
visual report html
Code
__author__ = "Tetsuro Hisayoshi"
__copyright__ = "Copyright 2020, Tetsuro Hisayoshi"
__email__ = "hisayoshi0530@gmail.com"
__license__ = "MIT"

import os
import tempfile
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

log_dir = os.path.dirname(snakemake.log[0])
output_dir = os.path.dirname(snakemake.output[0])

# sample name defaults to basename
sample_name = snakemake.params.get(
    "sample_name", os.path.splitext(os.path.basename(snakemake.input.bam))[0]
)

# gVCF flags stay empty unless the rule declares a `gvcf` output.
# The {placeholders} are filled in later, when shell() formats the command.
make_examples_gvcf = postprocess_gvcf = ""
gvcf = snakemake.output.get("gvcf", None)
if gvcf:
    make_examples_gvcf = "--gvcf {tmp_dir} "
    postprocess_gvcf = (
        "--gvcf_infile {tmp_dir}/{sample_name}.gvcf.tfrecord@{snakemake.threads}.gz "
        "--gvcf_outfile {snakemake.output.gvcf} "
    )

# Run the three steps in one subshell so that {log} captures all of them.
# Intermediate files go to a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp_dir:
    shell(
        "(dv_make_examples.py "
        "--cores {snakemake.threads} "
        "--ref {snakemake.input.ref} "
        "--reads {snakemake.input.bam} "
        "--sample {sample_name} "
        "--examples {tmp_dir} "
        "--logdir {log_dir} " + make_examples_gvcf + "{extra} \n"
        "dv_call_variants.py "
        "--cores {snakemake.threads} "
        "--outfile {tmp_dir}/{sample_name}.tmp "
        "--sample {sample_name} "
        "--examples {tmp_dir} "
        "--model {snakemake.params.model} \n"
        "dv_postprocess_variants.py "
        "--ref {snakemake.input.ref} "
        + postprocess_gvcf
        + "--infile {tmp_dir}/{sample_name}.tmp "
        "--outfile {snakemake.output.vcf} ) {log}"
    )
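
The wrapper code above makes two decisions before building the command: the sample name falls back to the BAM file's base name, and the gVCF flags are added only when a gvcf output is declared. The following is a minimal standalone sketch of that logic, runnable without Snakemake. The function names default_sample_name and gvcf_arguments, and the example paths, are hypothetical and exist only for illustration; they are not part of the wrapper.

```python
import os


def default_sample_name(bam_path):
    """Return the BAM file name without its directory or final extension."""
    return os.path.splitext(os.path.basename(bam_path))[0]


def gvcf_arguments(gvcf_path, tmp_dir, sample_name, threads):
    """Build the optional gVCF flags for the first and last pipeline steps.

    Mirrors the wrapper's `if gvcf:` branch; returns two empty strings
    when no gVCF output is requested. Hypothetical helper, not wrapper API.
    """
    if not gvcf_path:
        return "", ""
    make_examples = f"--gvcf {tmp_dir} "
    postprocess = (
        f"--gvcf_infile {tmp_dir}/{sample_name}.gvcf.tfrecord@{threads}.gz "
        f"--gvcf_outfile {gvcf_path} "
    )
    return make_examples, postprocess


# Only the last extension is stripped, so "a.sorted.bam" keeps ".sorted".
print(default_sample_name("mapped/a.bam"))
print(default_sample_name("mapped/a.sorted.bam"))

# Without a gvcf output, both flag strings are empty.
print(gvcf_arguments(None, "/tmp/dv", "a", 2))

# With a gvcf output, both steps receive extra flags.
print(gvcf_arguments("gvcf_calls/a.g.vcf.gz", "/tmp/dv", "a", 2))
```

Because only the final extension is removed, a BAM named a.sorted.bam would yield the sample name a.sorted; setting sample_name explicitly in params, as the first example rule does, avoids that.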