.. _`bio/pyroe/makeunspliceunspliced`:

PYROE MAKE-SPLICED+UNSPLICED
============================

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/pyroe/makeunspliceunspliced?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/pyroe/makeunspliceunspliced

Build spliceu reference files for Alevin-fry. The spliceu (spliced + unspliced) transcriptome reference contains the spliced transcripts plus, for each gene, an unspliced transcript that represents the entire genomic interval of that gene.

**URL**: https://pyroe.readthedocs.io/en/latest/building_splici_index.html#preparing-a-spliced-unspliced-transcriptome-reference

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule test_pyroe_makesplicedunspliced:
        input:
            fasta="genome.fasta",
            gtf="annotation.gtf",
            spliced="extra_spliced.fasta",  # Optional path to additional spliced sequences (FASTA)
            unspliced="extra_unspliced.fasta",  # Optional path to additional unspliced sequences (FASTA)
        output:
            gene_id_to_name="gene_id_to_name.tsv",
            fasta="spliceu.fa",
            g2g="spliceu_g2g.tsv",
            t2g_3col="spliceu_t2g_3col.tsv",
            t2g="spliceu_t2g.tsv",
        threads: 1
        log:
            "logs/pyroe.log",
        params:
            extra="",  # Optional parameters
        wrapper:
            "v3.0.1/bio/pyroe/makeunspliceunspliced/"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

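
The optional ``spliced`` and ``unspliced`` inputs of the rule above are forwarded to pyroe as ``--extra-spliced`` and ``--extra-unspliced``, and are left out of the command when they are not set. The following is a minimal sketch of that mapping in plain Python. ``build_pyroe_command`` is a hypothetical helper written for illustration only; it is not part of the wrapper or of pyroe, and it builds an argument list instead of running anything.

.. code-block:: python

    import shlex


    def build_pyroe_command(fasta, gtf, outdir, spliced=None, unspliced=None, extra=""):
        """Assemble a ``pyroe make-spliced+unspliced`` argument list.

        Hypothetical helper: it mirrors how optional inputs become flags,
        but it does not execute pyroe.
        """
        cmd = ["pyroe", "make-spliced+unspliced"]
        # Free-form extra parameters are split the way a shell would split them.
        cmd += shlex.split(extra)
        # Optional inputs only add their flag when a path is provided.
        if spliced:
            cmd += ["--extra-spliced", spliced]
        if unspliced:
            cmd += ["--extra-unspliced", unspliced]
        # Positional arguments: genome FASTA, GTF annotation, output directory.
        cmd += [fasta, gtf, outdir]
        return cmd


    # Example: only additional spliced sequences are supplied.
    command = build_pyroe_command(
        "genome.fasta", "annotation.gtf", "outdir", spliced="extra_spliced.fasta"
    )

Omitting ``spliced`` or ``unspliced`` from the rule's ``input`` therefore drops the corresponding flag from the command.
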
Software dependencies
---------------------

* ``pyroe=0.9.3``
* ``bedtools=2.31.1``

Input/Output
------------

**Input:**

* ``gtf``: Path to the genome annotation (GTF formatted)
* ``fasta``: Path to the genome sequence (FASTA formatted)
* ``spliced``: Optional path to additional spliced sequences (FASTA formatted)
* ``unspliced``: Optional path to additional unspliced sequences (FASTA formatted)

**Output:**

* ``fasta``: Path to spliced+unspliced sequences (FASTA formatted)
* ``gene_id_to_name``: Path to a TSV formatted text file containing the gene_id <-> gene_name correspondence
* ``t2g_3col``: Path to a TSV formatted text file containing the transcript_id <-> gene_name <-> splicing status correspondence
* ``t2g``: Path to a TSV formatted text file containing the transcript_id <-> gene_name correspondence
* ``g2g``: Path to a TSV formatted text file containing the gene_id <-> gene_name correspondence

Params
------

* ``extra``: Optional parameters to be passed to pyroe

Authors
-------

* Thibault Dayris

Code
----

.. code-block:: python

    __author__ = "Thibault Dayris"
    __copyright__ = "Copyright 2023, Thibault Dayris"
    __email__ = "thibault.dayris@gustaveroussy.fr"
    __license__ = "MIT"

    from tempfile import TemporaryDirectory
    from snakemake.shell import shell

    log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)
    extra = snakemake.params.get("extra", "")

    # Optional inputs are only passed to pyroe when they are provided
    spliced = snakemake.input.get("spliced", "")
    if spliced:
        spliced = "--extra-spliced " + spliced

    unspliced = snakemake.input.get("unspliced", "")
    if unspliced:
        unspliced = "--extra-unspliced " + unspliced

    # pyroe writes into a temporary directory; requested outputs are moved
    # to their final paths before the directory is removed
    with TemporaryDirectory() as tempdir:
        shell(
            "pyroe make-spliced+unspliced "
            "{extra} {spliced} "
            "{unspliced} "
            "{snakemake.input.fasta} "
            "{snakemake.input.gtf} "
            "{tempdir} "
            "{log}"
        )

        if snakemake.output.get("fasta", False):
            shell("mv --verbose {tempdir}/spliceu.fa {snakemake.output.fasta} {log}")

        if snakemake.output.get("gene_id_to_name", False):
            shell(
                "mv --verbose "
                "{tempdir}/gene_id_to_name.tsv "
                "{snakemake.output.gene_id_to_name} {log}"
            )

        if snakemake.output.get("t2g_3col", False):
            shell(
                "mv --verbose "
                "{tempdir}/spliceu_t2g_3col.tsv "
                "{snakemake.output.t2g_3col} {log}"
            )

        if snakemake.output.get("t2g", False):
            shell("mv --verbose {tempdir}/spliceu_t2g.tsv {snakemake.output.t2g} {log}")

        if snakemake.output.get("g2g", False):
            shell("mv --verbose {tempdir}/spliceu_g2g.tsv {snakemake.output.g2g} {log}")

.. |nl| raw:: html