REFGENIE
Deploy biomedical reference datasets via refgenie.
Example
This wrapper can be used in the following way:
rule obtain_asset:
output:
# the name refers to the refgenie seek key (see attributes on http://refgenomes.databio.org)
fai="refs/genome.fasta"
# Multiple outputs/seek keys are possible here.
params:
genome="human_alu",
asset="fasta",
tag="default"
wrapper:
"v5.0.1/bio/refgenie"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
refgenie=0.12.1
refgenconf=0.12.2
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2019, Johannes Köster"
__email__ = "johannes.koester@uni-due.de"
__license__ = "MIT"
import os
import refgenconf
genome = snakemake.params.genome
asset = snakemake.params.asset
tag = snakemake.params.tag
conf_path = os.environ["REFGENIE"]
rgc = refgenconf.RefGenConf(conf_path, writable=True)
# pull asset if necessary
gat, archive_data, server_url = rgc.pull(genome, asset, tag, force=False)
for seek_key, out in snakemake.output.items():
path = rgc.seek(genome, asset, tag_name=tag, seek_key=seek_key, strict_exists=True)
os.symlink(path, out)