DEEPVARIANT
Call genetic variants using a deep neural network. Copyright 2017 Google LLC. License: BSD 3-Clause “New” or “Revised”. https://github.com/google/deepvariant
Example
This wrapper can be used in the following way:
rule deepvariant:
    input:
        bam="mapped/{sample}.bam",
        ref="genome/genome.fasta"
    output:
        vcf="calls/{sample}.vcf.gz"
    params:
        model="wgs",  # {wgs, wes, pacbio, hybrid}
        sample_name=lambda w: w.sample,  # optional
        extra=""
    threads: 2
    log:
        "logs/deepvariant/{sample}/stdout.log"
    wrapper:
        "v2.6.0/bio/deepvariant"


rule deepvariant_gvcf:
    input:
        bam="mapped/{sample}.bam",
        ref="genome/genome.fasta"
    output:
        vcf="gvcf_calls/{sample}.vcf.gz",
        gvcf="gvcf_calls/{sample}.g.vcf.gz"
    params:
        model="wgs",  # {wgs, wes, pacbio, hybrid}
        extra=""
    threads: 2
    log:
        "logs/deepvariant/{sample}/stdout.log"
    wrapper:
        "v2.6.0/bio/deepvariant"
Note that input, output and log file paths can be chosen freely.
When running with
    snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
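Because paths can be chosen freely, it is worth knowing that the BAM file name matters when sample_name is omitted, as in the deepvariant_gvcf rule: the wrapper code then falls back to the BAM base name without its extension. A minimal sketch of that fallback follows; the helper name default_sample_name and the example paths are hypothetical and only serve as illustration.

```python
import os


def default_sample_name(bam_path):
    """Mirror the wrapper's fallback: BAM file name without its last extension."""
    return os.path.splitext(os.path.basename(bam_path))[0]


# Hypothetical paths, chosen only to show the behaviour.
simple = default_sample_name("mapped/NA12878.bam")
dotted = default_sample_name("mapped/NA12878.sorted.bam")

print(simple)  # NA12878
print(dotted)  # NA12878.sorted (only the final extension is removed)
```

If the derived name is not the one wanted in the VCF header, setting sample_name explicitly, as the first rule does, avoids the dependency on the file name.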
Notes
- The extra param allows for additional program arguments; the wrapper code appends them to the dv_make_examples.py call only.
- This Snakemake wrapper uses the Bioconda deepvariant package. Copyright 2018 Brad Chapman.
Software dependencies
deepvariant=1.4
numpy=1.23
Authors
- Tetsuro Hisayoshi
- Nikos Tsardakas Renhuldt
Code
__author__ = "Tetsuro Hisayoshi"
__copyright__ = "Copyright 2020, Tetsuro Hisayoshi"
__email__ = "hisayoshi0530@gmail.com"
__license__ = "MIT"
import os
import tempfile
from snakemake.shell import shell
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")
log_dir = os.path.dirname(snakemake.log[0])
output_dir = os.path.dirname(snakemake.output[0])
# sample name defaults to basename
sample_name = snakemake.params.get(
    "sample_name", os.path.splitext(os.path.basename(snakemake.input.bam))[0]
)
# gVCF flags stay empty unless the rule declares a "gvcf" output
make_examples_gvcf = postprocess_gvcf = ""
gvcf = snakemake.output.get("gvcf", None)
if gvcf:
    make_examples_gvcf = "--gvcf {tmp_dir} "
    postprocess_gvcf = (
        "--gvcf_infile {tmp_dir}/{sample_name}.gvcf.tfrecord@{snakemake.threads}.gz "
        "--gvcf_outfile {snakemake.output.gvcf} "
    )
# intermediate files go to a temporary directory that is removed afterwards
with tempfile.TemporaryDirectory() as tmp_dir:
    # the three stages run in one subshell so that {log} captures all of them
    shell(
        "(dv_make_examples.py "
        "--cores {snakemake.threads} "
        "--ref {snakemake.input.ref} "
        "--reads {snakemake.input.bam} "
        "--sample {sample_name} "
        "--examples {tmp_dir} "
        "--logdir {log_dir} " + make_examples_gvcf + "{extra} \n"
        "dv_call_variants.py "
        "--cores {snakemake.threads} "
        "--outfile {tmp_dir}/{sample_name}.tmp "
        "--sample {sample_name} "
        "--examples {tmp_dir} "
        "--model {snakemake.params.model} \n"
        "dv_postprocess_variants.py "
        "--ref {snakemake.input.ref} "
        + postprocess_gvcf
        + "--infile {tmp_dir}/{sample_name}.tmp "
        "--outfile {snakemake.output.vcf} ) {log}"
    )
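
The wrapper code above chains three stages (dv_make_examples.py, dv_call_variants.py, dv_postprocess_variants.py) and adds the gVCF flags only when a gvcf output is declared. The following is a minimal sketch of that command assembly in plain Python, without Snakemake. The function name build_deepvariant_commands and the example paths (including /tmp/dv) are hypothetical, and nothing is executed. It returns the commands as a list, whereas the wrapper joins them with newlines inside one subshell and redirects the output to the log.

```python
def build_deepvariant_commands(bam, ref, vcf, model, threads, tmp_dir, log_dir,
                               sample_name, gvcf=None, extra=""):
    """Return the three command strings in the order the wrapper runs them."""
    # gVCF flags stay empty unless a gVCF output path is given.
    make_examples_gvcf = postprocess_gvcf = ""
    if gvcf:
        make_examples_gvcf = f"--gvcf {tmp_dir} "
        postprocess_gvcf = (
            f"--gvcf_infile {tmp_dir}/{sample_name}.gvcf.tfrecord@{threads}.gz "
            f"--gvcf_outfile {gvcf} "
        )

    # Stage 1: the only stage that receives the user-supplied extra arguments.
    make_examples = (
        f"dv_make_examples.py --cores {threads} --ref {ref} --reads {bam} "
        f"--sample {sample_name} --examples {tmp_dir} --logdir {log_dir} "
        f"{make_examples_gvcf}{extra}"
    ).strip()
    # Stage 2: writes an intermediate file into the temporary directory.
    call_variants = (
        f"dv_call_variants.py --cores {threads} "
        f"--outfile {tmp_dir}/{sample_name}.tmp --sample {sample_name} "
        f"--examples {tmp_dir} --model {model}"
    )
    # Stage 3: turns the intermediate file into the final VCF (and gVCF).
    postprocess = (
        f"dv_postprocess_variants.py --ref {ref} {postprocess_gvcf}"
        f"--infile {tmp_dir}/{sample_name}.tmp --outfile {vcf}"
    )
    return [make_examples, call_variants, postprocess]


# Hypothetical values modelled on the deepvariant_gvcf example rule.
commands = build_deepvariant_commands(
    bam="mapped/a.bam",
    ref="genome/genome.fasta",
    vcf="gvcf_calls/a.vcf.gz",
    gvcf="gvcf_calls/a.g.vcf.gz",
    model="wgs",
    threads=2,
    tmp_dir="/tmp/dv",
    log_dir="logs/deepvariant/a",
    sample_name="a",
)
for command in commands:
    print(command)
```

Printing the commands this way can help when checking which flags a given rule would produce before running the actual workflow.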