SALMON_INDEX
Index a transcriptome assembly with salmon
Example
This wrapper can be used in the following way:
rule salmon_index:
    input:
        sequences="assembly/transcriptome.fasta",
    output:
        multiext(
            "salmon/transcriptome_index/",
            "complete_ref_lens.bin",
            "ctable.bin",
            "ctg_offsets.bin",
            "duplicate_clusters.tsv",
            "info.json",
            "mphf.bin",
            "pos.bin",
            "pre_indexing.log",
            "rank.bin",
            "refAccumLengths.bin",
            "ref_indexing.log",
            "reflengths.bin",
            "refseq.bin",
            "seq.bin",
            "versionInfo.json",
        ),
    log:
        "logs/salmon/transcriptome_index.log",
    threads: 2
    params:
        # optional parameters
        extra="",
    wrapper:
        "v1.19.1/bio/salmon/index"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
salmon==1.8.0
Input/Output
Input:
sequences
: Path to the sequences to index with Salmon. This can be a transcriptome or a gentrome.
decoys
: Optional path to a file of decoy sequence names, for use when the sequences above are a gentrome.
Output:
- indexed assembly
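When `sequences` is a gentrome, the `decoys` file is typically a plain-text list of the genome sequence names, one per line. The following is a minimal sketch of deriving such a list from FASTA headers. The function name `decoy_names` and the sample records are hypothetical and are not part of the wrapper.

```python
import io


def decoy_names(fasta_handle):
    """Yield the sequence name from each FASTA header line.

    The name is taken as the first whitespace-delimited token after '>'.
    """
    for line in fasta_handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


# Hypothetical genome records, used only for illustration.
genome = io.StringIO(">chr1 dna:chromosome\nACGT\n>chr2\nTTGA\n")
names = list(decoy_names(genome))

# One name per line, as would be written to the decoys file.
print("\n".join(names))
```

In a workflow, the resulting names would be written to a file and passed to the rule as the `decoys` input alongside the gentrome in `sequences`.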
Params
extra
: Optional parameters besides --tmpdir, --threads, and the input/output arguments.
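Because the wrapper sets some options itself, it can help to check an `extra` string before running the rule. The sketch below is one possible check, not something the wrapper does. The helper `reserved_flags` and the `RESERVED` set are hypothetical, only long-form flags are inspected, and `--kmerLen 31` is just a sample value for `extra`.

```python
import shlex

# Long-form flags assumed to be managed by the wrapper, based on the
# command it builds. Short forms are deliberately not covered here.
RESERVED = {"--tmpdir", "--threads", "--transcripts", "--index", "--decoys"}


def reserved_flags(extra):
    """Return the reserved flags found in an `extra` string, in order."""
    found = []
    for token in shlex.split(extra):
        # Handle both "--flag value" and "--flag=value" spellings.
        flag = token.split("=", 1)[0]
        if flag in RESERVED and flag not in found:
            found.append(flag)
    return found


print(reserved_flags("--kmerLen 31"))
print(reserved_flags("--threads 8 --kmerLen 31"))
```

An empty list means the string does not collide with the options listed above.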
Authors
- Tessa Pierce
- Thibault Dayris
Code
"""Snakemake wrapper for Salmon Index."""

__author__ = "Tessa Pierce"
__copyright__ = "Copyright 2018, Tessa Pierce"
__email__ = "ntpierce@gmail.com"
__license__ = "MIT"

from os.path import dirname
from snakemake.shell import shell
from tempfile import TemporaryDirectory

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

decoys = snakemake.input.get("decoys", "")
if decoys:
    decoys = f"--decoys {decoys}"

output = snakemake.output
if len(output) > 1:
    output = dirname(snakemake.output[0])

with TemporaryDirectory() as tempdir:
    shell(
        "salmon index "
        "--transcripts {snakemake.input.sequences} "
        "--index {output} "
        "--threads {snakemake.threads} "
        "--tmpdir {tempdir} "
        "{decoys} "
        "{extra} "
        "{log}"
    )
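The way the wrapper chooses the value passed to `--index` can be illustrated outside Snakemake. This is a rough sketch rather than the wrapper itself: `index_dir` is a hypothetical helper, and a plain list of strings stands in for `snakemake.output`.

```python
from os.path import dirname


def index_dir(outputs):
    """Approximate how the wrapper derives the index directory.

    With several declared output files, the directory of the first one
    is used; with a single output, that path is used as given.
    """
    if len(outputs) > 1:
        return dirname(outputs[0])
    return outputs[0]


# Paths as they would expand from the multiext() call in the example rule.
files = [
    "salmon/transcriptome_index/complete_ref_lens.bin",
    "salmon/transcriptome_index/ctable.bin",
]
print(index_dir(files))
print(index_dir(["salmon/transcriptome_index"]))
```

Since only the first output is inspected, all declared index files need to share one directory, as they do in the example rule.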