TRINITY¶
Generate transcriptome assembly with Trinity
Software dependencies¶
- trinity ==2.8.4
Example¶
This wrapper can be used in the following way:
rule trinity:
input:
left=["reads/reads.left.fq.gz", "reads/reads2.left.fq.gz"],
right=["reads/reads.right.fq.gz", "reads/reads2.right.fq.gz"]
output:
"trinity_out_dir/Trinity.fasta"
log:
'logs/trinity/trinity.log'
params:
extra=""
threads: 4
wrapper:
"0.32.0/bio/trinity"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Tessa Pierce
Code¶
"""Snakemake wrapper for Trinity."""
__author__ = "Tessa Pierce"
__copyright__ = "Copyright 2018, Tessa Pierce"
__email__ = "ntpierce@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
max_memory = snakemake.params.get("max_memory", "10G")
#allow multiple input files for single assembly
left = snakemake.input.get("left")
assert left is not None, "input-> left is a required input parameter"
left = [snakemake.input.left] if isinstance(snakemake.input.left, str) else snakemake.input.left
right = snakemake.input.get("right")
if right:
right = [snakemake.input.right] if isinstance(snakemake.input.right, str) else snakemake.input.right
assert len(left) >= len(right), "left input needs to contain at least the same number of files as the right input (can contain additional, single-end files)"
input_str_left = ' --left ' + ",".join(left)
input_str_right = ' --right ' + ",".join(right)
else:
input_str_left = ' --single ' + ",".join(left)
input_str_right = ''
input_cmd = " ".join([input_str_left, input_str_right])
# infer seqtype from input files:
seqtype = snakemake.params.get("seqtype")
if not seqtype:
if 'fq' in left[0] or 'fastq' in left[0]:
seqtype = 'fq'
elif 'fa' in left[0] or 'fasta' in left[0]:
seqtype = 'fa'
else: # assertion is redundant - warning or error instead?
assert seqtype is not None, "cannot infer 'fq' or 'fa' seqtype from input files. Please specify 'fq' or 'fa' in 'seqtype' parameter"
outdir = path.dirname(snakemake.output[0])
assert 'trinity' in outdir, "output directory name must contain 'trinity'"
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell("Trinity {input_cmd} --CPU {snakemake.threads} "
" --max_memory {max_memory} --seqType {seqtype} "
" --output {outdir} {snakemake.params.extra} "
" {log}")