.. _`bio/transdecoder/longorfs`:

TRANSDECODER LONGORFS
=====================

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/transdecoder/longorfs?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/transdecoder/longorfs

TransDecoder.LongOrfs identifies candidate coding regions (ORFs) within transcript sequences that are at least 100 amino acids long. You can lower this threshold with the ``-m`` parameter, but the rate of false-positive ORF predictions increases drastically with shorter minimum lengths.

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule transdecoder_longorfs:
        input:
            fasta="test.fa.gz",  # required
            gene_trans_map="test.gtm"  # optional gene-to-transcript identifier mapping file (tab-delimited: gene_id<tab>trans_id)
        output:
            "test.fa.transdecoder_dir/longest_orfs.pep"
        log:
            "logs/transdecoder/test-longorfs.log"
        params:
            extra=""
        wrapper:
            "v3.0.1/bio/transdecoder/longorfs"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies
---------------------

* ``transdecoder=5.7.1``

Input/Output
------------

**Input:**

* fasta transcripts

**Output:**

* ORFs peptide file(s)

Authors
-------

* N. Tessa Pierce

Code
----

.. code-block:: python

    """Snakemake wrapper for Transdecoder LongOrfs"""

    __author__ = "N. Tessa Pierce"
    __copyright__ = "Copyright 2019, N. Tessa Pierce"
    __email__ = "ntpierce@gmail.com"
    __license__ = "MIT"

    from os import path
    from snakemake.shell import shell

    extra = snakemake.params.get("extra", "")
    log = snakemake.log_fmt_shell(stdout=True, stderr=True)

    gtm_cmd = ""
    gtm = snakemake.input.get("gene_trans_map", "")
    if gtm:
        gtm_cmd = " --gene_trans_map " + gtm

    output_dir = path.dirname(str(snakemake.output))

    # transdecoder fails if output already exists. No force option available
    shell("rm -rf {output_dir}")

    input_fasta = str(snakemake.input.fasta)
    if input_fasta.endswith("gz"):
        input_fa = input_fasta.rsplit(".gz")[0]
        shell("gunzip -c {input_fasta} > {input_fa}")
    else:
        input_fa = input_fasta

    shell("TransDecoder.LongOrfs -t {input_fa} {gtm_cmd} {log}")

.. |nl| raw:: html