MEHARI DOWNLOAD TRANSCRIPT DB

Download a mehari transcript database from the mehari-data-tx GitHub releases.

URL: https://github.com/varfish-org/mehari

Example

This wrapper can be used in the following way:

rule download_mehari_transcript_db:
    output:
        "resources/mehari/dbs/transcripts.bin.zst",
    params:
        version="0.10.3",  # check https://github.com/varfish-org/mehari-data-tx/releases for available versions
        build="GRCh38",  # GRCh37 or GRCh38
        source="ensembl",  # ensembl, refseq or ensembl-and-refseq
    log:
        "logs/mehari/download_mehari_transcript_db.log",
    cache: "omit-software"  # save space and time with between-workflow caching (see docs)
    wrapper:
        "v9.4.0/bio/mehari/download-transcript-db"

Note that the output and log file paths can be chosen freely; this wrapper takes no input files.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • curl=8.19.0

Params

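  • version: Version of the transcript DB in MAJOR.MINOR.PATCH format without a leading "v" (e.g. 0.10.3); see https://github.com/varfish-org/mehari-data-tx/releases for available versions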

  • build: GRCh37 or GRCh38

  • source: ensembl, refseq or ensembl-and-refseq
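
How the three params combine into a download URL can be sketched as follows. This is an illustration that mirrors the URL template in the wrapper code, not part of the wrapper itself; the helper name transcript_db_url and the constants BUILDS and SOURCES are hypothetical.

```python
# Sketch only: mirrors the URL template used in the wrapper code.
# `transcript_db_url`, `BUILDS` and `SOURCES` are hypothetical names.

BUILDS = {"grch37": "GRCh37", "grch38": "GRCh38"}
SOURCES = {"ensembl", "refseq", "ensembl-and-refseq"}


def transcript_db_url(version: str, build: str, source: str) -> str:
    """Return the release asset URL for the given params."""
    build_key = build.lower()
    if build_key not in BUILDS:
        raise ValueError("build must be 'GRCh37' or 'GRCh38'")
    source = source.lower()
    if source not in SOURCES:
        raise ValueError("source must be 'ensembl', 'refseq' or 'ensembl-and-refseq'")
    # The version appears twice: once in the release tag (with a "v" prefix)
    # and once in the asset file name (without it).
    return (
        "https://github.com/varfish-org/mehari-data-tx/releases/download/"
        f"v{version}/mehari-data-txs-{BUILDS[build_key]}-{source}-{version}.bin.zst"
    )


print(transcript_db_url("0.10.3", "GRCh38", "ensembl"))
```

Whether a given combination of version, build and source actually exists as a release asset still has to be checked on the releases page.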

Authors

  • Till Hartmann

Code

__author__ = "Till Hartmann"
__copyright__ = "Copyright 2025, Till Hartmann"
__email__ = "till.hartmann@bih-charite.de"
__license__ = "MIT"

from snakemake.shell import shell

import re

version_re = re.compile(r"\d+\.\d+\.\d+")

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

version = snakemake.params.get("version", "")
if not version_re.fullmatch(version):
    raise ValueError("version must have format MAJOR.MINOR.PATCH")

build = snakemake.params.get("build", "").lower()
if build not in {"grch37", "grch38"}:
    raise ValueError("build must be 'GRCh37' or 'GRCh38'")
build = {"grch37": "GRCh37", "grch38": "GRCh38"}[build]

source = snakemake.params.get("source", "").lower()
if source not in {"ensembl", "refseq", "ensembl-and-refseq"}:
    raise ValueError("source must be 'ensembl', 'refseq' or 'ensembl-and-refseq'")

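# Release tags carry a "v" prefix; the asset file name does not.
url = (
    "https://github.com/varfish-org/mehari-data-tx/releases/download/"
    f"v{version}/mehari-data-txs-{build}-{source}-{version}.bin.zst"
)

# --fail makes curl exit non-zero on HTTP errors; --location follows redirects.
shell("curl --fail --silent --location {url} -o {snakemake.output[0]:q} {log}")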
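
The parameter checks above can be tried outside of Snakemake with a small standalone sketch. It reuses the same regular expression and allowed values as the wrapper code; the function name validate_params is hypothetical, and the function only validates and normalises the values without downloading anything.

```python
import re

# Same pattern as in the wrapper: MAJOR.MINOR.PATCH, no leading "v".
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def validate_params(version: str, build: str, source: str) -> tuple[str, str, str]:
    """Hypothetical helper: return normalised params or raise ValueError."""
    if not VERSION_RE.fullmatch(version):
        raise ValueError("version must have format MAJOR.MINOR.PATCH")
    builds = {"grch37": "GRCh37", "grch38": "GRCh38"}
    if build.lower() not in builds:
        raise ValueError("build must be 'GRCh37' or 'GRCh38'")
    source = source.lower()
    if source not in {"ensembl", "refseq", "ensembl-and-refseq"}:
        raise ValueError("source must be 'ensembl', 'refseq' or 'ensembl-and-refseq'")
    return version, builds[build.lower()], source


examples = [
    ("0.10.3", "grch38", "RefSeq"),   # accepted: build and source are case-insensitive
    ("v0.10.3", "GRCh38", "ensembl"), # rejected: fullmatch does not allow the "v" prefix
    ("0.10.3", "hg38", "ensembl"),    # rejected: only GRCh37 and GRCh38 are known
]
for params in examples:
    try:
        print("accepted:", validate_params(*params))
    except ValueError as err:
        print(f"rejected {params}: {err}")
```

Because the version is matched with fullmatch, a value such as "v0.10.3" is rejected even though the release tag itself starts with "v"; the wrapper adds that prefix when it builds the URL.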