GENOMEPY

Download genomes the easy way: https://github.com/vanheeringen-lab/genomepy

Example

This wrapper can be used in the following way:

rule genomepy:
    output:
        multiext(
            "{assembly}/{assembly}",
            ".fa",
            ".fa.fai",
            ".fa.sizes",
            ".gaps.bed",
            ".annotation.gtf",
            ".blacklist.bed",
        ),
    log:
        "logs/genomepy_{assembly}.log",
    params:
        provider="ucsc",  # optional, defaults to ucsc. Choose from ucsc, ensembl, and ncbi
    cache: "omit-software"  # mark as eligible for between-workflow caching
    wrapper:
        "v3.8.0-49-g6f33607/bio/genomepy"

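Note that output and log file paths can be chosen freely, provided the output paths contain the {assembly} wildcard, which the wrapper passes to genomepy as the genome name. The gene annotation and the blacklist are only requested when an output path contains "annotation" or "blacklist", respectively.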
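To make the rule's output list concrete, the sketch below expands the same prefix and extensions for one assembly in plain Python. It only imitates what multiext produces for this rule and is not Snakemake's own implementation; the helper name expand_outputs and the assembly name hg38 are chosen for illustration.

```python
# Illustrative only: mimics how the rule's multiext(...) call expands
# for a single assembly. expand_outputs is a hypothetical helper.
EXTENSIONS = [
    ".fa",
    ".fa.fai",
    ".fa.sizes",
    ".gaps.bed",
    ".annotation.gtf",
    ".blacklist.bed",
]


def expand_outputs(assembly, extensions=EXTENSIONS):
    """Return the output paths the example rule declares for one assembly."""
    prefix = f"{assembly}/{assembly}"
    return [prefix + ext for ext in extensions]


if __name__ == "__main__":
    for path in expand_outputs("hg38"):
        print(path)
```

Requesting any one of these paths as a target, for example hg38/hg38.fa, makes Snakemake run the rule with the assembly wildcard set to hg38.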

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • genomepy=0.16.1

Params

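  • provider: which provider to download the genome from. Optional; defaults to ucsc. Choose from ucsc, ensembl, or ncbi (the value is lower-cased by the wrapper, so UCSC, Ensembl, and NCBI work as well).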
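A minimal sketch of how the provider value is resolved, mirroring the lookup in the wrapper script. A plain dict stands in for snakemake.params here, and the helper name resolve_provider is hypothetical.

```python
def resolve_provider(params):
    """Mirror the wrapper's lookup: optional key, default "ucsc", lower-cased."""
    return params.get("provider", "ucsc").lower()


if __name__ == "__main__":
    print(resolve_provider({}))                       # no provider given
    print(resolve_provider({"provider": "Ensembl"}))  # mixed case is accepted
```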

Authors

  • Maarten van der Sande

Code

__author__ = "Maarten van der Sande"
__copyright__ = "Copyright 2020, Maarten van der Sande"
__email__ = "M.vanderSande@science.ru.nl"
__license__ = "MIT"


from snakemake.shell import shell

# Optional parameters
provider = snakemake.params.get("provider", "ucsc").lower()

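# set options for plugins
# The plugin lists are wrapped in braces in the shell command below and
# brace-expanded by the shell; the trailing comma keeps that expansion
# valid when zero or one plugin is requested.
all_plugins = "blacklist,bowtie2,bwa,gmap,hisat2,minimap2,star"
req_plugins = ","
if any("blacklist" in out for out in snakemake.output):
    req_plugins = "blacklist,"

# download the gene annotation only if an annotation file is requested
annotation = ""
if any("annotation" in out for out in snakemake.output):
    annotation = "--annotation"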

# parse the genome dir
genome_dir = "./"
if snakemake.output[0].count("/") > 1:
    genome_dir = "/".join(snakemake.output[0].split("/")[:-1])

log = snakemake.log

# Finally execute genomepy
shell(
    """
    # set a trap so we can reset to original user's settings
    active_plugins=$(genomepy config show | grep -Po '(?<=- ).*' | paste -s -d, -) || echo ""
    trap "genomepy plugin disable {{{all_plugins}}} >> {log} 2>&1;\
          genomepy plugin enable {{$active_plugins,}} >> {log} 2>&1" EXIT

    # disable all, then enable the ones we need
    genomepy plugin disable {{{all_plugins}}} >  {log} 2>&1
    genomepy plugin enable  {{{req_plugins}}} >> {log} 2>&1

    # install the genome
    genomepy install {snakemake.wildcards.assembly} \
    --provider {provider} {annotation} -g {genome_dir} >> {log} 2>&1
    """
)
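The option handling in the script above can be tried outside Snakemake with a small standalone sketch. It copies the script's logic into a function that takes a plain list of output paths; the function name derive_options is hypothetical and not part of the wrapper.

```python
def derive_options(outputs):
    """Return (req_plugins, annotation, genome_dir) as the script computes them."""
    # Enable the blacklist plugin only if a blacklist file is requested.
    req_plugins = ","
    if any("blacklist" in out for out in outputs):
        req_plugins = "blacklist,"

    # Pass --annotation only if an annotation file is requested.
    annotation = ""
    if any("annotation" in out for out in outputs):
        annotation = "--annotation"

    # Use the current directory unless the first output has more than
    # one directory level, in which case its parent directory is used.
    genome_dir = "./"
    if outputs[0].count("/") > 1:
        genome_dir = "/".join(outputs[0].split("/")[:-1])

    return req_plugins, annotation, genome_dir


if __name__ == "__main__":
    print(derive_options(["hg38/hg38.fa"]))
    print(derive_options(["hg38/hg38.fa", "hg38/hg38.annotation.gtf"]))
```

With only hg38/hg38.fa requested, no plugin is enabled and no annotation is downloaded; adding the .annotation.gtf or .blacklist.bed outputs switches on the corresponding option.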