NGSDERIVE


Backward computing of information from next-generation sequencing data, and annotation of splice junctions.

URL: https://github.com/stjudecloud/ngsderive

Example

This wrapper can be used in the following way:

rule test_ngsderive_endedness:
    input:
        ngs="A.rg.bam",
    output:
        tsv="A.endedness.tsv",
    log:
        "ngsderive/endedness.log",
    params:
        command="endedness",
        extra="--n-reads 2",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_junction_annotation:
    input:
        ngs="A.rg.bam",
        gene_model="annotation.sorted.gtf.gz",
    output:
        tsv="A.junctions.tsv",
        junction_dir=directory("junctions"),
    log:
        "ngsderive/junctions.log",
    params:
        command="junction-annotation",
        extra="--min-intron 2 --consider-unannotated-references-novel",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_junction_annotation_list:
    input:
        ngs="A.rg.bam",
        gene_model="annotation.sorted.gtf.gz",
    output:
        tsv="A.junctions_list.tsv",
        junction_dir=["junctions/A.rg.bam.junctions.tsv"],
    log:
        "ngsderive/junctions_list.log",
    params:
        command="junction-annotation",
        extra="--min-intron 2 --consider-unannotated-references-novel",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_strandedness:
    input:
        ngs="A.rg.bam",
        gene_model="annotation.sorted.gtf.gz",
    output:
        tsv="A.strandedness.tsv",
    log:
        "ngsderive/strand.log",
    params:
        command="strandedness",
        extra="--verbose --minimum-reads-per-gene 2 --n-genes 1",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_encoding:
    input:
        ngs="A.rg.bam",
    output:
        tsv="A.encoding.tsv",
    log:
        "ngsderive/encoding.log",
    params:
        command="encoding",
        extra="--n-reads 2",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_instrument:
    input:
        ngs="A.rg.bam",
    output:
        tsv="A.instrument.tsv",
    log:
        "ngsderive/instrument.log",
    params:
        command="instrument",
        extra="--n-reads 2 --verbose",
    wrapper:
        "v3.12.2/bio/ngsderive"


rule test_ngsderive_readlen:
    input:
        ngs="A.rg.bam",
    output:
        tsv="A.readlen.tsv",
    log:
        "ngsderive/readlen.log",
    params:
        command="readlen",
        extra="--majority-vote-cutoff 10 --n-reads 2",
    wrapper:
        "v3.12.2/bio/ngsderive"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

GTF/GFF will be automatically sorted and tabix-indexed by ngsderive if needed.
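
To see ahead of time whether a gene model already has an index, a check along the following lines may help. This is a minimal sketch, not part of the wrapper: `has_tabix_index` is a hypothetical helper, and it assumes the index sits next to the file with the `.tbi` (or `.csi`) suffix that tabix writes by default. How ngsderive itself detects a missing index is not covered here.

```python
from pathlib import Path
import tempfile


def has_tabix_index(gene_model):
    """Return True if a .tbi or .csi file sits next to the gene model.

    Hypothetical helper: it only looks for the default index file names,
    it does not validate the index content.
    """
    return any(
        Path(f"{gene_model}{suffix}").exists() for suffix in (".tbi", ".csi")
    )


# Demonstration on placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    gtf = Path(tmp) / "annotation.sorted.gtf.gz"
    gtf.touch()
    before = has_tabix_index(gtf)  # no index file yet

    Path(f"{gtf}.tbi").touch()
    after = has_tabix_index(gtf)  # index file now present
```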

Software dependencies

  • ngsderive=4.0.0

Input/Output

Input:

  • ngs: Path to a BAM/SAM/FASTQ file. SAM/BAM files should be indexed.

  • gene_model: Path to sorted GTF/GFF file. Should be tabix indexed.

Output:

  • tsv: Path to output file

  • junction_dir: Optional path to the junction directory, or a list of paths to junction files that share a common directory prefix
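
When the junction output is given as a list of files, the wrapper code further down reduces it to a single directory by taking the common prefix of the parent directories. The sketch below reproduces that step in isolation; `resolve_junction_dir` and the file names are illustrative, not part of the wrapper. It also shows a caveat: `os.path.commonprefix` compares character by character, so sibling directories with similar names yield a partial name rather than a real directory.

```python
from os.path import commonpath, commonprefix, dirname


def resolve_junction_dir(junction_files):
    """Character-wise common prefix of the parent directories,
    as done in the wrapper code (illustrative re-implementation)."""
    return commonprefix([dirname(fp) for fp in junction_files])


# All files in one directory: the prefix is that directory.
files = [
    "junctions/A.rg.bam.junctions.tsv",
    "junctions/B.rg.bam.junctions.tsv",
]
same_dir = resolve_junction_dir(files)

# Files in sibling directories: the prefix is a partial name, not a directory.
mixed = ["out_a/A.junctions.tsv", "out_b/B.junctions.tsv"]
partial = resolve_junction_dir(mixed)

# os.path.commonpath compares whole path components instead.
component_wise = commonpath([dirname(fp) for fp in mixed])
```

Keeping all listed junction files in one directory avoids the partial-prefix case entirely.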

Params

  • command: Name of the ngsderive subcommand (e.g. endedness, strandedness, junction-annotation)

  • extra: Optional parameters, besides --outfile (-o), --gene-model (-g) and --junction-files-dir, which the wrapper sets itself
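
To illustrate how these inputs, outputs and parameters end up on the command line, here is a rough sketch that mirrors the string assembly in the wrapper code. `build_ngsderive_command` is a hypothetical function written for this example; the real wrapper also appends log redirection, which is left out here, and it does not strip the empty options the way this sketch does.

```python
def build_ngsderive_command(
    command, ngs, tsv, extra="", gene_model="", junction_dir=""
):
    """Assemble an ngsderive call in the same order as the wrapper.

    Hypothetical helper for illustration; log redirection is omitted.
    """
    gene_model_opt = f"--gene-model {gene_model}" if gene_model else ""
    junction_opt = f"--junction-files-dir {junction_dir}" if junction_dir else ""
    parts = [
        "ngsderive",
        command,
        extra,
        gene_model_opt,
        junction_opt,
        ngs,
        "--outfile",
        tsv,
    ]
    # Drop empty options so the result has no doubled spaces.
    return " ".join(part for part in parts if part)


# Values taken from the strandedness example rule.
cmd = build_ngsderive_command(
    command="strandedness",
    ngs="A.rg.bam",
    tsv="A.strandedness.tsv",
    extra="--verbose --minimum-reads-per-gene 2 --n-genes 1",
    gene_model="annotation.sorted.gtf.gz",
)
print(cmd)
```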

Authors

  • Thibault Dayris

Code

# coding: utf-8

__author__ = "Thibault Dayris"
__mail__ = "thibault.dayris@gustaveroussy.fr"
__copyright__ = "Copyright 2024, Thibault Dayris"
__license__ = "MIT"

from os.path import commonprefix, dirname
from snakemake.shell import shell
from warnings import warn

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

gene_model = snakemake.input.get("gene_model", "")
if gene_model:
    gene_model = f"--gene-model {gene_model}"


junction_dir = snakemake.output.get("junction_dir", "")
if isinstance(junction_dir, list):
    junction_dir = commonprefix([dirname(fp) for fp in junction_dir])
    if not junction_dir:
        warn(
            "No common prefix was found within the list of "
            "files given as `junction_dir`. Falling "
            "back to default ngsderive value"
        )

if junction_dir:
    junction_dir = f"--junction-files-dir {junction_dir}"


shell(
    "ngsderive {snakemake.params.command} "
    "{extra} {gene_model} {junction_dir} "
    "{snakemake.input.ngs} "
    "--outfile {snakemake.output.tsv} "
    "{log} "
)