SALMON_INDEX
Index a transcriptome assembly with salmon
Example
This wrapper can be used in the following way:
rule salmon_index:
    input:
        sequences="assembly/transcriptome.fasta",
    output:
        multiext(
            "salmon/transcriptome_index/",
            "index.ssi",
            "refseq_offsets.json",
            "index.ectab",
            "index.ctab",
            "refseq.bin",
            "index.ssi.mphf",
            "index.refinfo",
            "info.json",
            "duplicate_clusters.tsv",
            "index.tct",
            "index.tdct",
        ),
    log:
        "logs/salmon/transcriptome_index.log",
    threads: 2
    params:
        # optional parameters
        extra="",
    wrapper:
        "v9.11.0/bio/salmon/index"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
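To make the output declaration in the example rule easier to read, the following sketch shows the file list that a prefix plus a set of suffixes expands to. It is a plain-Python stand-in written for illustration, not Snakemake's own multiext implementation; the helper name expand_multiext is hypothetical and only three of the suffixes from the example are used.

```python
def expand_multiext(prefix, *extensions):
    """Illustrative stand-in: append each suffix to the shared prefix."""
    return [prefix + ext for ext in extensions]


# A subset of the index files declared in the example rule.
index_files = expand_multiext(
    "salmon/transcriptome_index/",
    "index.ssi",
    "info.json",
    "refseq.bin",
)

for path in index_files:
    print(path)
```

Because every declared file shares the same prefix, they all sit in one directory, which is what the wrapper relies on when it works out the index location.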
Software dependencies
salmon=2.0.1
Input/Output
Input:
sequences: Path to the sequences to index with Salmon. This can be a transcriptome or a gentrome.
decoys: Optional path to a file listing the decoy sequence names, for use when the sequences above are a gentrome.
Output:
indexed assembly
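When the input is a gentrome, the decoys file names the sequences that should be treated as decoys. The sketch below shows one way such a list of names could be collected from FASTA header lines. The function name decoy_names and the sample records are hypothetical, and the sketch assumes that a sequence name is the first whitespace-delimited token after the ">" of each header.

```python
def decoy_names(fasta_lines):
    """Collect sequence names from FASTA header lines.

    Assumption: the name is the first token after '>' on a header line.
    """
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip a bare '>' with no name
                names.append(fields[0])
    return names


# Hypothetical genome records used only for illustration.
genome = [">chr1 dna:chromosome", "ACGT", ">chr2", "GGCC"]

# One name per line, as it could be written to a decoys file.
print("\n".join(decoy_names(genome)))
```

The resulting file would then be passed as the decoys input alongside the gentrome given as sequences.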
Params
extra: Optional additional parameters for salmon index, other than --tmpdir, --threads, and the input/output arguments, which the wrapper sets itself.
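As a rough illustration of where extra ends up, the sketch below assembles a command line in the same argument order as the wrapper code in the Code section. The helper build_index_command is hypothetical and only builds a string; the real wrapper runs the command through Snakemake's shell function. The value passed as extra is just an example and assumes that the installed salmon accepts it.

```python
def build_index_command(sequences, index, threads, tmpdir, decoys="", extra=""):
    """Build a salmon index command string (illustrative only)."""
    # The decoys flag is added only when a decoys file is given.
    decoys_flag = f"--decoys {decoys}" if decoys else ""
    parts = [
        "salmon index",
        f"--transcripts {sequences}",
        f"--index {index}",
        f"--threads {threads}",
        f"--tmpdir {tmpdir}",
        decoys_flag,
        extra,  # user-supplied options are appended last
    ]
    # Drop empty pieces so the string has no doubled spaces.
    return " ".join(part for part in parts if part)


print(
    build_index_command(
        "assembly/transcriptome.fasta",
        "salmon/transcriptome_index",
        threads=2,
        tmpdir="/tmp/salmon",
        extra="--kmerLen 31",  # example value; assumed to be a valid option
    )
)
```

Since the wrapper already sets --threads and --tmpdir, repeating them in extra would pass the same option twice.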
Code
"""Snakemake wrapper for Salmon Index."""

__author__ = "Tessa Pierce"
__copyright__ = "Copyright 2018, Tessa Pierce"
__email__ = "ntpierce@gmail.com"
__license__ = "MIT"

from os.path import dirname
from snakemake.shell import shell
from tempfile import TemporaryDirectory

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

# Pass the decoys file to salmon only when one is provided.
decoys = snakemake.input.get("decoys", "")
if decoys:
    decoys = f"--decoys {decoys}"

# When several index files are declared, use the directory of the first one
# as the index path.
output = snakemake.output
if isinstance(output, list) and len(output) > 1:
    output = dirname(snakemake.output[0])

# Build the index, using a temporary directory that is removed afterwards.
with TemporaryDirectory() as tempdir:
    shell(
        "salmon index "
        "--transcripts {snakemake.input.sequences} "
        "--index {output} "
        "--threads {snakemake.threads} "
        "--tmpdir {tempdir} "
        "{decoys} "
        "{extra} "
        "{log}"
    )
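The way the wrapper picks the index path from the declared outputs can be sketched in isolation. The helper resolve_index_path is hypothetical and takes a plain list of strings, whereas the wrapper works on the snakemake.output object. It assumes that a single declared output is itself the index directory.

```python
from os.path import dirname


def resolve_index_path(outputs):
    """Return the path to hand to --index (illustrative only)."""
    if len(outputs) > 1:
        # Several index files: use the directory of the first one.
        return dirname(outputs[0])
    # A single output is assumed to be the index directory itself.
    return outputs[0]


print(resolve_index_path([
    "salmon/transcriptome_index/index.ssi",
    "salmon/transcriptome_index/info.json",
]))
print(resolve_index_path(["salmon/transcriptome_index"]))
```

Only the first declared file is consulted, so all index files in a rule should be declared under the same directory.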