.. _`bio/trinity`: TRINITY ======= .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/trinity?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/trinity Generate transcriptome assembly with Trinity **URL**: https://github.com/trinityrnaseq/trinityrnaseq/ Example ------- This wrapper can be used in the following way: .. code-block:: python rule trinity: input: left=["reads/reads.left.fq.gz", "reads/reads2.left.fq.gz"], right=["reads/reads.right.fq.gz", "reads/reads2.right.fq.gz"], output: dir=temp(directory("trinity_out_dir/")), fas="trinity_out_dir.Trinity.fasta", map="trinity_out_dir.Trinity.fasta.gene_trans_map", log: 'logs/trinity/trinity.log', params: extra="", threads: 4 resources: mem_gb=10, wrapper: "v3.0.4/bio/trinity" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Software dependencies --------------------- * ``trinity=2.15.1`` Input/Output ------------ **Input:** * fastq files **Output:** * ``fas``: fasta containing assembly * ``map``: gene transcripts map * ``dir``: folder for intermediate results Authors ------- * Tessa Pierce Code ---- .. code-block:: python """Snakemake wrapper for Trinity.""" __author__ = "Tessa Pierce" __copyright__ = "Copyright 2018, Tessa Pierce" __email__ = "ntpierce@gmail.com" __license__ = "MIT" from snakemake.shell import shell extra = snakemake.params.get("extra", "") log = snakemake.log_fmt_shell(stdout=True, stderr=True) # Previous wrapper reserved 10 Gigabytes by default. This behaviour is # preserved below: max_memory = "10G" # Getting memory in megabytes, if java opts is not filled with -Xmx parameter # By doing so, backward compatibility is preserved if "mem_mb" in snakemake.resources.keys(): # max_memory from trinity expects a value in gigabytes. rounded_mb_to_gb = int(snakemake.resources["mem_mb"] / 1024) max_memory = "{}G".format(rounded_mb_to_gb) # Getting memory in gigabytes, for user convenience. Please prefer the use # of mem_mb over mem_gb as advised in documentation. elif "mem_gb" in snakemake.resources.keys(): max_memory = "{}G".format(snakemake.resources["mem_gb"]) # allow multiple input files for single assembly left = snakemake.input.get("left") assert left is not None, "input-> left is a required input parameter" left = ( [snakemake.input.left] if isinstance(snakemake.input.left, str) else snakemake.input.left ) right = snakemake.input.get("right") if right: right = ( [snakemake.input.right] if isinstance(snakemake.input.right, str) else snakemake.input.right ) assert len(left) >= len( right ), "left input needs to contain at least the same number of files as the right input (can contain additional, single-end files)" input_str_left = " --left " + ",".join(left) input_str_right = " --right " + ",".join(right) else: input_str_left = " --single " + ",".join(left) input_str_right = "" input_cmd = " ".join([input_str_left, input_str_right]) # infer seqtype from input files: seqtype = snakemake.params.get("seqtype") if not seqtype: if "fq" in left[0] or "fastq" in left[0]: seqtype = "fq" elif "fa" in left[0] or "fas" in left[0] or "fasta" in left[0]: seqtype = "fa" else: # assertion is redundant - warning or error instead? assert ( seqtype is not None ), "cannot infer 'fq' or 'fa' seqtype from input files. Please specify 'fq' or 'fa' in 'seqtype' parameter" shell( "Trinity {input_cmd} --CPU {snakemake.threads} " " --max_memory {max_memory} --seqType {seqtype} " " --output {snakemake.output.dir} {snakemake.params.extra} " " {log}" ) .. |nl| raw:: html