STAR INDEX

Index FASTA sequences with STAR.

URL: https://github.com/alexdobin/STAR

Example

This wrapper can be used in the following way:

rule star_index:
    input:
        fasta="{genome}.fasta",
    output:
        directory("{genome}"),
    message:
        "Testing STAR index"
    threads: 1
    params:
        extra="",
    log:
        "logs/star_index_{genome}.log",
    wrapper:
        "v3.9.0-14-g476823b/bio/star/index"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • star=2.7.11b

Input/Output

Input:

  • fasta: a (multi)FASTA formatted file

  • gtf: a GTF annotation file, passed to STAR as --sjdbGTFfile (optional)

Output:

  • A directory containing the indexed sequence for downstream STAR mapping

Params

  • sjdbOverhang: length of the donor/acceptor sequence on each side of the junctions, ideally read length - 1 (optional)

  • extra: additional program arguments.
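
The way these optional settings end up on the STAR command line can be sketched as follows. This is a minimal illustration rather than the wrapper itself: the function name build_index_options is hypothetical, and plain dicts stand in for snakemake.params and snakemake.input. Empty or missing values simply contribute nothing to the command.

```python
def build_index_options(params, inputs):
    """Return the optional STAR arguments as a single string.

    `params` and `inputs` are plain dicts standing in for
    snakemake.params and snakemake.input (an assumption of this sketch).
    """
    parts = []

    # sjdbOverhang is only emitted when a non-empty value is given.
    sjdb_overhang = params.get("sjdbOverhang", "")
    if sjdb_overhang:
        parts.append(f"--sjdbOverhang {sjdb_overhang}")

    # The GTF annotation is an optional input, not a param.
    gtf = inputs.get("gtf", "")
    if gtf:
        parts.append(f"--sjdbGTFfile {gtf}")

    # extra is passed through verbatim.
    extra = params.get("extra", "")
    if extra:
        parts.append(extra)

    return " ".join(parts)


# Hypothetical values: 100 bp reads, so sjdbOverhang = 100 - 1.
print(build_index_options({"sjdbOverhang": 99}, {"gtf": "annotation.gtf"}))
# With nothing set, no optional arguments are produced.
print(build_index_options({"extra": ""}, {}))
```

Any STAR option without a dedicated param, for example --genomeSAindexNbases, would go through extra unchanged.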

Authors

  • Thibault Dayris

  • Tomás Di Domenico

  • Filipe G. Vieira

Code

"""Snakemake wrapper for STAR index"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2019, Dayris Thibault"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

import tempfile
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

sjdb_overhang = snakemake.params.get("sjdbOverhang", "")
if sjdb_overhang:
    sjdb_overhang = f"--sjdbOverhang {sjdb_overhang}"

gtf = snakemake.input.get("gtf", "")
if gtf:
    gtf = f"--sjdbGTFfile {gtf}"


with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "STAR"
        " --runThreadN {snakemake.threads}"  # Number of threads
        " --runMode genomeGenerate"  # Index generation mode
        " --genomeFastaFiles {snakemake.input.fasta}"  # Path to fasta files
        " {sjdb_overhang}"  # Ideally read length - 1
        " {gtf}"  # Optional GTF annotation (highly recommended)
        " {extra}"  # Optional parameters
        " --outTmpDir {tmpdir}/STARtmp"  # Temp dir
        " --genomeDir {snakemake.output}"  # Path to output
        " {log}"  # Logging
    )
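
The temporary-directory pattern used above can be tried in isolation. The sketch below uses only the standard library and does not call STAR. It shows that the STARtmp path handed to --outTmpDir sits inside a fresh directory and does not exist yet, and that everything is removed once the with block exits. The assumption here, not stated in the wrapper, is that STAR wants to create its temporary directory itself, which would explain why a not-yet-existing subpath is passed instead of tmpdir.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Path that would be passed to --outTmpDir; it is only computed, not created.
    star_tmp = os.path.join(tmpdir, "STARtmp")

    parent_exists = os.path.isdir(tmpdir)     # the fresh parent directory exists
    child_exists = os.path.exists(star_tmp)   # the STARtmp subpath does not
    print(parent_exists, child_exists)

# Leaving the with block deletes tmpdir and anything created inside it.
cleaned_up = not os.path.exists(tmpdir)
print(cleaned_up)
```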