PRE.PY
Preprocesses and normalises VCF/BCF files. Part of the hap.py suite by Illumina (see https://github.com/Illumina/hap.py/blob/master/doc/normalisation.md).
Example
This wrapper can be used in the following way:
rule preprocess_variants:
    input:
        # VCF/BCF file to normalise
        variants="variants.vcf",
    output:
        "normalized/variants.vcf.gz",
    log:
        "log/pre.log",
    params:
        # path to reference genome
        genome="genome.fasta",
        # parameters such as -L to left-align variants
        extra="-L",
    threads: 2
    wrapper:
        "v4.6.0/bio/hap.py/pre.py"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
hap.py=0.3.15
Code
__author__ = "Jan Forster"
__copyright__ = "Copyright 2019, Jan Forster"
__email__ = "j.forster@dkfz.de"
__license__ = "MIT"

from os import path
from snakemake.shell import shell

# Extract arguments
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "(pre.py"
    " --threads {snakemake.threads}"
    " -r {snakemake.params.genome}"
    " {extra}"
    " {snakemake.input.variants}"
    " {snakemake.output})"
    " {log}"
)
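
To make the argument order easier to see, the following is a minimal sketch of how the wrapper code above assembles the pre.py command line from the rule's fields. The helper `build_pre_command` is hypothetical and is not part of hap.py, Snakemake, or this wrapper. It uses only the flags that appear in the wrapper (`--threads`, `-r`) and the example value `-L` for `extra`, and it builds the command without running it.

```python
import shlex


def build_pre_command(variants, output, genome, threads=1, extra=""):
    """Return the pre.py command line as a list of arguments.

    Hypothetical helper for illustration only; it mirrors the argument
    order used by the wrapper but does not execute anything.
    """
    cmd = ["pre.py", "--threads", str(threads), "-r", genome]
    # `extra` is a free-form string such as "-L"; split it the way a shell would.
    cmd += shlex.split(extra)
    # Input file first, then the output path, as in the wrapper.
    cmd += [variants, output]
    return cmd


cmd = build_pre_command(
    variants="variants.vcf",
    output="normalized/variants.vcf.gz",
    genome="genome.fasta",
    threads=2,
    extra="-L",
)
print(shlex.join(cmd))
```

With the values from the example rule, this prints the command that the wrapper passes to the shell, minus the log handling: the wrapper additionally appends the redirection produced by `snakemake.log_fmt_shell(stdout=False, stderr=True)`, so that only standard error is written to the log file.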