MEHARI ANNOTATE SEQVARS
Annotate variant calls with mehari.
URL: https://github.com/varfish-org/mehari
Example
This wrapper can be used in the following way:
rule mehari_annotate_seqvars_variants_MT:
    input:
        calls="{prefix}.vcf",  # .vcf, .vcf.gz or .bcf
        ref="resources/MT.fasta",  # has to be uncompressed
        fai="resources/MT.fasta.fai",
        transcript_db="resources/MT-ND2-GRCh38-ensembl-0.10.3.bin.zst",  # transcript database for SO term / consequence annotation
        # clinvar_db="resources/clinvar.bin.zst",  # clinvar database for clinvar VCV annotation
        # frequency_db="resources/frequencies.bin.zst"  # frequencies/gnomad database for frequency annotation
    output:
        calls="{prefix}.annotated.bcf",  # .vcf, .vcf.gz or .bcf
    params:
        extra="",
    log:
        "logs/mehari/mehari_annotate_variants.{prefix}.log",
    wrapper:
        "v9.0.1/bio/mehari/annotate-seqvars"
Note that input, output and log file paths can be chosen freely.
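To make the role of the {prefix} wildcard concrete, the following minimal Python sketch fills one wildcard value into the path patterns used in the example rule. Snakemake does this matching itself when a target such as results/sample1.annotated.bcf is requested; the helper name expand_paths and the sample prefix are illustrative assumptions, not part of the wrapper.

```python
# Path patterns copied from the example rule above.
PATTERNS = {
    "input_calls": "{prefix}.vcf",
    "output_calls": "{prefix}.annotated.bcf",
    "log": "logs/mehari/mehari_annotate_variants.{prefix}.log",
}


def expand_paths(prefix: str) -> dict[str, str]:
    """Fill the {prefix} wildcard into every pattern for one job (hypothetical helper)."""
    return {name: pattern.format(prefix=prefix) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    # "results/sample1" is an assumed example value for the wildcard.
    for name, path in expand_paths("results/sample1").items():
        print(f"{name}: {path}")
```

If the patterns are changed, the only constraint illustrated here is that the same wildcard name appears in the input, output and log patterns so that one value determines all three paths.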
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
mehari=0.39.0
Input/Output
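Input:
    calls: variant calls to annotate (.vcf, .vcf.gz or .bcf)
    ref: reference FASTA, uncompressed (optional; without it, HGVS 3' shifting for genomic coordinates cannot be done correctly)
    fai: FASTA index for ref (required whenever ref is given)
    transcript_db: transcript database for SO term / consequence annotation
    clinvar_db: ClinVar database for ClinVar VCV annotation
    frequency_db: frequencies/gnomAD database for frequency annotation
    At least one of transcript_db, clinvar_db and frequency_db must be specified.
Output:
    calls: annotated variant calls (.vcf, .vcf.gz or .bcf)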
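The wrapper code in the Code section rejects some input combinations at run time. As a rough sketch, the same rules can be checked ahead of time on a plain dictionary of inputs. The function name check_inputs and the message wording are assumptions made for illustration; only the rules themselves are taken from the wrapper code.

```python
def check_inputs(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means the combination is accepted.

    Hypothetical pre-flight check mirroring the conditions in the wrapper code.
    """
    problems = []
    # The wrapper references snakemake.input.calls unconditionally.
    if not inputs.get("calls"):
        problems.append("'calls' is required")
    # At least one annotation database has to be present.
    databases = ("transcript_db", "clinvar_db", "frequency_db")
    if not any(inputs.get(key) for key in databases):
        problems.append(
            "at least one of transcript_db, clinvar_db, frequency_db is required"
        )
    # A reference FASTA is optional, but it needs its index when given.
    if inputs.get("ref") and not inputs.get("fai"):
        problems.append("'fai' is required when 'ref' is given")
    return problems


if __name__ == "__main__":
    print(check_inputs({"calls": "sample1.vcf", "clinvar_db": "clinvar.bin.zst"}))
    print(check_inputs({"calls": "sample1.vcf", "ref": "MT.fasta"}))
```

Omitting ref is not reported as a problem here because the wrapper only logs a warning in that case.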
Params
extra: Extra arguments for the mehari annotate seqvars invocation.
Code
__author__ = "Till Hartmann"
__copyright__ = "Copyright 2025, Till Hartmann"
__email__ = "till.hartmann@bih-charite.de"
__license__ = "MIT"

from snakemake.shell import shell
import logging

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

transcript_db = snakemake.input.get("transcript_db", "")
if transcript_db:
    transcript_db = f"--transcripts {transcript_db}"

clinvar_db = snakemake.input.get("clinvar_db", "")
if clinvar_db:
    clinvar_db = f"--clinvar {clinvar_db}"

frequency_db = snakemake.input.get("frequency_db", "")
if frequency_db:
    frequency_db = f"--frequency {frequency_db}"

if not transcript_db and not clinvar_db and not frequency_db:
    raise ValueError(
        "At least one of inputs 'transcript_db', 'clinvar_db' and 'frequency_db' must be specified"
    )

ref = snakemake.input.get("ref", "")
if ref:
    ref = f"--reference {ref}"
    if not snakemake.input.get("fai"):
        raise ValueError("Reference FASTA index must be specified")
else:
    logging.warning(
        "Without reference fasta, cannot do correct HGVS 3' shifting for genomic coordinates."
    )

shell(
    "(mehari annotate seqvars "
    "--path-input-vcf {snakemake.input.calls:q} "
    "{transcript_db} "
    "{clinvar_db} "
    "{frequency_db} "
    "{ref} "
    "{extra} "
    "--path-output-vcf {snakemake.output.calls:q} "
    ") {log}"
)
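To see which command line the wrapper ends up running, the assembly logic can be approximated outside of Snakemake. The sketch below is a simplified stand-in, not the wrapper itself: build_command is a hypothetical name, it takes plain dictionaries instead of the snakemake object, it omits the log redirection, and it shell-quotes every path with shlex.quote, whereas the wrapper quotes only the two calls paths. The flags are the ones that appear in the wrapper code above.

```python
import shlex

# Input name -> mehari flag, as used in the wrapper code above.
OPTIONAL_FLAGS = (
    ("transcript_db", "--transcripts"),
    ("clinvar_db", "--clinvar"),
    ("frequency_db", "--frequency"),
    ("ref", "--reference"),
)


def build_command(inputs: dict, output_calls: str, extra: str = "") -> str:
    """Assemble the mehari command line as a string (hypothetical helper)."""
    parts = [
        "mehari", "annotate", "seqvars",
        "--path-input-vcf", shlex.quote(inputs["calls"]),
    ]
    # Only inputs that are actually provided contribute a flag.
    for key, flag in OPTIONAL_FLAGS:
        if inputs.get(key):
            parts += [flag, shlex.quote(inputs[key])]
    if extra:
        parts.append(extra)  # passed through verbatim, like {extra} in the wrapper
    parts += ["--path-output-vcf", shlex.quote(output_calls)]
    return " ".join(parts)


if __name__ == "__main__":
    # File names below are assumed example values.
    print(
        build_command(
            {"calls": "sample1.vcf", "transcript_db": "tx.bin.zst"},
            "sample1.annotated.bcf",
        )
    )
```

Printing the command this way can help when deciding what to put into params.extra, since that string is inserted unchanged just before the output option.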