GENOMEPY#
Download genomes the easy way: https://github.com/vanheeringen-lab/genomepy
Example#
This wrapper can be used in the following way:
rule genomepy:
    output:
        multiext(
            "{assembly}/{assembly}",
            ".fa",
            ".fa.fai",
            ".fa.sizes",
            ".gaps.bed",
            ".annotation.gtf",
            ".blacklist.bed",
        ),
    log:
        "logs/genomepy_{assembly}.log",
    params:
        provider="ucsc",  # optional, defaults to ucsc. Choose from ucsc, ensembl, and ncbi
    cache: "omit-software"  # mark as eligible for between workflow caching
    wrapper:
        "v3.0.2-2-g0dea6a1/bio/genomepy"
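Note that output and log file paths can be chosen freely, provided they keep the {assembly} wildcard, which the wrapper passes to genomepy install. The wrapper requests the annotation or the blacklist only when an output path contains "annotation" or "blacklist".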
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
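As a rough illustration of the output paths declared in the example rule, the sketch below imitates what multiext does for a single assembly. The helper expand_multiext and the assembly name hg38 are hypothetical stand-ins chosen for this example; Snakemake's own multiext and wildcard handling are not imported here.

```python
# Hypothetical stand-in for Snakemake's multiext: prefix + each extension.
def expand_multiext(prefix, *extensions):
    return [prefix + ext for ext in extensions]


# "hg38" is an assumed example value for the {assembly} wildcard.
assembly = "hg38"

outputs = expand_multiext(
    "{assembly}/{assembly}".format(assembly=assembly),
    ".fa",
    ".fa.fai",
    ".fa.sizes",
    ".gaps.bed",
    ".annotation.gtf",
    ".blacklist.bed",
)

# Print every file the rule would be expected to produce for this assembly.
for path in outputs:
    print(path)
```

Requesting any of these paths as a target, for example hg38/hg38.fa, would then run the rule with the assembly wildcard set to hg38.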
Software dependencies#
genomepy=0.16.1
Params#
provider: which provider to download from (UCSC, Ensembl, or NCBI). Optional; defaults to UCSC. The wrapper lowercases the value, so capitalization does not matter.
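The handling of this parameter can be sketched in plain Python as follows. The function normalize_provider and the VALID_PROVIDERS tuple are hypothetical names introduced for illustration, and the validation step is an addition of this sketch: the wrapper code shown in the Code section only applies the default and lowercases the value.

```python
# Providers listed in the Params section; the tuple itself is an assumption
# of this sketch, the wrapper does not define it.
VALID_PROVIDERS = ("ucsc", "ensembl", "ncbi")


def normalize_provider(params):
    """Apply the default and lowercase the value, as the wrapper does."""
    provider = params.get("provider", "ucsc").lower()
    # Extra check added for illustration only; not part of the wrapper.
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return provider


print(normalize_provider({}))                        # no provider given
print(normalize_provider({"provider": "Ensembl"}))   # mixed-case input
```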
Code#
__author__ = "Maarten van der Sande"
__copyright__ = "Copyright 2020, Maarten van der Sande"
__email__ = "M.vanderSande@science.ru.nl"
__license__ = "MIT"

from snakemake.shell import shell

# Optional parameters
provider = snakemake.params.get("provider", "ucsc").lower()

# set options for plugins
all_plugins = "blacklist,bowtie2,bwa,gmap,hisat2,minimap2,star"
req_plugins = ","
if any(["blacklist" in out for out in snakemake.output]):
    req_plugins = "blacklist,"

annotation = ""
if any(["annotation" in out for out in snakemake.output]):
    annotation = "--annotation"

# parse the genome dir
genome_dir = "./"
if snakemake.output[0].count("/") > 1:
    genome_dir = "/".join(snakemake.output[0].split("/")[:-1])

log = snakemake.log

# Finally execute genomepy
shell(
    """
    # set a trap so we can reset to original user's settings
    active_plugins=$(genomepy config show | grep -Po '(?<=- ).*' | paste -s -d, -) || echo ""
    trap "genomepy plugin disable {{{all_plugins}}} >> {log} 2>&1;\
          genomepy plugin enable {{$active_plugins,}} >> {log} 2>&1" EXIT

    # disable all, then enable the ones we need
    genomepy plugin disable {{{all_plugins}}} > {log} 2>&1
    genomepy plugin enable {{{req_plugins}}} >> {log} 2>&1

    # install the genome
    genomepy install {snakemake.wildcards.assembly} \
    --provider {provider} {annotation} -g {genome_dir} >> {log} 2>&1
    """
)
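
To make the option handling in the wrapper easier to follow, the sketch below repeats its logic as standalone functions that run without Snakemake. The names derive_options and render_enable_command are hypothetical, the example paths are assumed values, and Python's built-in str.format is used here only as a stand-in for the formatting that Snakemake's shell function applies to the command string.

```python
def derive_options(outputs):
    """Mirror how the wrapper turns output paths into genomepy options."""
    # "," is the wrapper's placeholder when no plugin is requested.
    req_plugins = ","
    if any("blacklist" in out for out in outputs):
        req_plugins = "blacklist,"

    # The annotation is only requested when an output path mentions it.
    annotation = ""
    if any("annotation" in out for out in outputs):
        annotation = "--annotation"

    # Same rule as the wrapper: keep "./" unless the first output path
    # contains more than one "/", then drop only the file name.
    genome_dir = "./"
    if outputs[0].count("/") > 1:
        genome_dir = "/".join(outputs[0].split("/")[:-1])

    return req_plugins, annotation, genome_dir


def render_enable_command(req_plugins):
    # Assumption: str.format stands in for Snakemake's shell formatting.
    # "{{" and "}}" become literal braces around the substituted value.
    return "genomepy plugin enable {{{req_plugins}}}".format(req_plugins=req_plugins)


# Assumed example outputs for an assembly called hg38.
options = derive_options(["hg38/hg38.fa", "hg38/hg38.blacklist.bed"])
print(options)
print(render_enable_command(options[0]))
```

The rendered command contains {blacklist,}, which Bash treats as a brace expression. The trailing comma matters because Bash performs brace expansion only when the braces contain at least one comma.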