REFGENIE
Deploy biomedical reference datasets via refgenie.
URL: https://refgenie.databio.org/en/latest/
Example
This wrapper can be used in the following way:
rule obtain_asset:
    output:
        # the name refers to the refgenie seek key (see attributes on http://refgenomes.databio.org)
        fai="refs/genome.fasta"
        # Multiple outputs/seek keys are possible here.
    params:
        genome="human_alu",
        asset="fasta",
        tag="default"
    wrapper:
        "v9.5.0/bio/refgenie"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
refgenie=0.13.0
refgenconf=0.13.1
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2019, Johannes Köster"
__email__ = "johannes.koester@uni-due.de"
__license__ = "MIT"
import os
import refgenconf

conf_path = os.environ["REFGENIE"]

rgc = refgenconf.RefGenConf.from_yaml_file(conf_path)

# pull asset if necessary
gat, archive_data, server_url = rgc.pull(
    snakemake.params.genome, snakemake.params.asset, snakemake.params.tag, force=False
)

for seek_key, out in snakemake.output.items():
    path = rgc.seek(
        snakemake.params.genome,
        snakemake.params.asset,
        tag_name=snakemake.params.tag,
        seek_key=seek_key,
        strict_exists=True,
    )
    os.symlink(path, out)
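
The final loop of the wrapper maps each named output (the seek key) to a resolved asset path and symlinks it into place. The following is a minimal sketch of that pattern which runs without refgenie installed. The helpers `config_path_from_env`, `link_outputs` and `demo` are hypothetical names introduced here for illustration and are not part of the wrapper or of refgenconf. The `resolve` callable stands in for the `rgc.seek(...)` call. Unlike the wrapper, this sketch replaces an existing file or link at the output path instead of failing, and it reports an unset `REFGENIE` variable with an explicit message rather than a bare `KeyError`.

```python
import os
import tempfile
from typing import Callable, Mapping


def config_path_from_env(var: str = "REFGENIE") -> str:
    """Return the config path stored in `var`; raise a readable error if unset."""
    path = os.environ.get(var)
    if not path:
        raise RuntimeError(
            f"Environment variable {var} must point to a refgenie config file"
        )
    return path


def link_outputs(resolve: Callable[[str], str], outputs: Mapping[str, str]) -> None:
    """Symlink each resolved asset path to its requested output location.

    `resolve` maps a seek key to an existing file path (hypothetical stand-in
    for the seek call in the wrapper). `outputs` maps seek keys to output paths.
    """
    for seek_key, out in outputs.items():
        target = resolve(seek_key)
        # Assumption of this sketch: a stale file or link at `out` is replaced.
        if os.path.lexists(out):
            os.remove(out)
        os.symlink(target, out)


def demo() -> str:
    """Link a fake 'fai' asset in a temporary directory; return the target's file name."""
    with tempfile.TemporaryDirectory() as tmp:
        asset = os.path.join(tmp, "genome.fa.fai")
        with open(asset, "w") as handle:
            handle.write("chr1\t100\n")
        out = os.path.join(tmp, "out.fai")
        link_outputs(lambda key: asset, {"fai": out})
        return os.path.basename(os.readlink(out))
```

Passing the resolver as a callable keeps the linking logic testable on its own: in a real run it would wrap the configured `RefGenConf` object, while a test can supply a plain function that returns temporary files.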