The Snakemake Wrappers repository

Requires Snakemake ≥3.11.0

The Snakemake Wrapper Repository is a collection of reusable wrappers that let you quickly use popular tools from Snakemake rules and workflows.

Usage

The general strategy is to include a wrapper in your workflow via the wrapper directive, e.g.

rule samtools_sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    params:
        "-m 4G"
    threads: 8
    wrapper:
        "0.2.0/bio/samtools/sort"

Here, Snakemake will automatically download the corresponding wrapper from https://bitbucket.org/snakemake/snakemake-wrappers/src/0.2.0/bio/samtools/sort/wrapper.py. The 0.2.0 prefix can be replaced with the version tag you want to use, or with a commit id. Pinning a version ensures reproducibility, since changes in the wrapper implementation won’t be propagated automatically to your workflow. Alternatively, e.g. for development, the wrapper directive can also point to a full URL, including a local file:// URL.
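As a rough illustration of the lookup described above, the following sketch maps the value of a wrapper directive to the file that would be fetched. The function name wrapper_url and the pass-through handling of full URLs are assumptions made for this example; this is not Snakemake's actual implementation.

```python
from urllib.parse import urlparse

# Base URL taken from the samtools sort example above.
REPO_PREFIX = "https://bitbucket.org/snakemake/snakemake-wrappers/src/"


def wrapper_url(spec):
    """Return the location to fetch for a wrapper directive value (hypothetical helper)."""
    # Assumption: anything with a URL scheme (https://, file://, ...) is used as given.
    if urlparse(spec).scheme:
        return spec
    # Otherwise the value is "<tag or commit id>/<path to wrapper>".
    return REPO_PREFIX + spec.strip("/") + "/wrapper.py"


print(wrapper_url("0.2.0/bio/samtools/sort"))
```

Changing only the leading tag or commit id changes which revision of the same wrapper is fetched.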

Each wrapper defines required software packages and versions. In combination with the --use-conda flag of Snakemake, these will be deployed automatically.

Contribute

We invite anybody to contribute to the Snakemake Wrapper Repository. If you want to contribute, we suggest the following procedure:

  • fork the repository
  • develop your contribution
  • open a pull request

The pull request will be reviewed and included as fast as possible. Contributions should follow the coding style of the existing wrappers, i.e.

  • provide a meta.yaml with name, description and author of the wrapper,
  • provide an environment.yaml that lists all required software packages (the packages must be available via https://anaconda.org),
  • provide an example Snakefile that shows how to use the wrapper,
  • follow the Python style guide (PEP 8),
  • use 4 spaces for indentation.
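
Before opening a pull request, the file requirements above can be checked with a few lines of Python. This is only a sketch: the helper missing_wrapper_files is hypothetical, wrapper.py is assumed to be the name of the wrapper script (as in the download URL shown under Usage), and the example Snakefile is not checked because its location is not specified here.

```python
from pathlib import Path

# File names from the checklist above, plus the wrapper script itself.
REQUIRED_FILES = ("meta.yaml", "environment.yaml", "wrapper.py")


def missing_wrapper_files(wrapper_dir):
    """Return the required files that are absent from wrapper_dir (hypothetical helper)."""
    wrapper_dir = Path(wrapper_dir)
    return [name for name in REQUIRED_FILES if not (wrapper_dir / name).is_file()]
```

An empty list means all three files are present in the given wrapper directory.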

BCFTOOLS

Wrappers

BCFTOOLS CALL

Call variants with bcftools.

Software dependencies
  • samtools ==1.5
  • bcftools ==1.5
Example

This wrapper can be used in the following way:

rule bcftools_call:
    input:
        ref="genome.fasta",
        samples=expand("mapped/{sample}.sorted.bam", sample=config["samples"]),
        indexes=expand("mapped/{sample}.sorted.bam.bai", sample=config["samples"])
    output:
        # Here, we optionally use a region as wildcard and constrain it to the
        # format accepted by samtools mpileup.
        "called/{region,.+(:[0-9]+-[0-9]+)?}.bcf"
    params:
        # Optional parameters for samtools mpileup (except -g, -f).
        # In this example, we forward the region wildcard from the output file to mpileup.
        mpileup="--region {region}",
        # Optional parameters for bcftools call (except -v, -o, -m).
        call=""
    log:
        "logs/bcftools_call/{region}.log"
    wrapper:
        "0.18.0/bio/bcftools/call"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
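
To see which file names the region constraint in the output of the example accepts, the pattern can be tried out with Python's re module. The sketch below rewrites the output pattern by hand as a named group; it illustrates the constraint and is not how Snakemake builds its regular expressions internally. The helper name region_of is hypothetical.

```python
import re

# Output pattern of the bcftools_call example, with the wildcard as a named group.
OUTPUT_PATTERN = re.compile(r"called/(?P<region>.+(:[0-9]+-[0-9]+)?)\.bcf")


def region_of(path):
    """Return the region wildcard value for path, or None if the path does not match."""
    match = OUTPUT_PATTERN.fullmatch(path)
    return match.group("region") if match else None


print(region_of("called/chr1:1000-2000.bcf"))
print(region_of("called/chr1.bcf"))
```

The matched value is what the example forwards to samtools mpileup via --region {region}.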

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell(
    "(samtools mpileup {snakemake.params.mpileup} {snakemake.input.samples} "
    "--fasta-ref {snakemake.input.ref} --BCF --uncompressed | "
    "bcftools call -m {snakemake.params.call} -o {snakemake.output[0]} -v -) 2> {snakemake.log}")
BCFTOOLS CONCAT

Concatenate vcf/bcf files with bcftools.

Software dependencies
  • bcftools ==1.5
Example

This wrapper can be used in the following way:

rule bcftools_concat:
    input:
        expand("called/{region}.bcf", region=chromosomes)
    output:
        "called/all.bcf"
    params:
        ""  # optional parameters for bcftools concat (except -o)
    wrapper:
        "0.18.0/bio/bcftools/concat"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell(
    "bcftools concat {snakemake.params} -o {snakemake.output[0]} "
    "{snakemake.input}")
BCFTOOLS VIEW

View vcf/bcf file in a different format.

Software dependencies
  • bcftools ==1.5
Example

This wrapper can be used in the following way:

rule bcf_to_vcf:
    input:
        "{prefix}.bcf"
    output:
        "{prefix}.vcf"
    params:
        ""  # optional parameters for bcftools view (except -o)
    wrapper:
        "0.18.0/bio/bcftools/view"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell(
    "bcftools view {snakemake.params} {snakemake.input[0]} "
    "-o {snakemake.output[0]}")

BOWTIE2

Wrappers

BOWTIE2

Map reads with bowtie2.

Software dependencies
  • bowtie2 ==2.3.2
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule bowtie2:
    input:
        sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        "mapped/{sample}.bam"
    log:
        "logs/bowtie2/{sample}.log"
    params:
        index="index/genome",  # prefix of reference genome index (built with bowtie2-build)
        extra=""  # optional parameters
    threads: 8
    wrapper:
        "0.18.0/bio/bowtie2/align"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
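
The wrapper chooses between single-end and paired-end mode from the number of files in input.sample. A standalone sketch of that choice, with the hypothetical function name bowtie2_read_flags (not part of the wrapper):

```python
def bowtie2_read_flags(fastqs):
    """Return the bowtie2 read arguments for one (single-end) or two (paired-end) files."""
    if len(fastqs) == 1:
        return "-U {}".format(*fastqs)
    if len(fastqs) == 2:
        return "-1 {} -2 {}".format(*fastqs)
    raise ValueError("expected 1 (single-end) or 2 (paired-end) files")


print(bowtie2_read_flags(["reads/a.1.fastq", "reads/a.2.fastq"]))
```

For single-end data, pass a one-element list as sample in the rule.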

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

n = len(snakemake.input.sample)
assert n == 1 or n == 2, "input->sample must have 1 (single-end) or 2 (paired-end) elements."

if n == 1:
    reads = "-U {}".format(*snakemake.input.sample)
else:
    reads = "-1 {} -2 {}".format(*snakemake.input.sample)

shell(
    "(bowtie2 --threads {snakemake.threads} {snakemake.params.extra} "
    "-x {snakemake.params.index} {reads} "
    "| samtools view -Sbh -o {snakemake.output[0]} -) {log}")

BWA

Wrappers

BWA ALN

Map reads with bwa aln.

Software dependencies
  • bwa ==0.7.15
Example

This wrapper can be used in the following way:

rule bwa_aln:
    input:
        "reads/{sample}.{pair}.fastq"
    output:
        "sai/{sample}.{pair}.sai"
    params:
        index="genome",
        extra=""
    log:
        "logs/bwa_aln/{sample}.{pair}.log"
    threads: 8
    wrapper:
        "0.18.0/bio/bwa/aln"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for bwa aln."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


extra = snakemake.params.get('extra', '')
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "bwa aln"
    " {extra}"
    " -t {snakemake.threads}"
    " {snakemake.params.index}"
    " {snakemake.input[0]}"
    " > {snakemake.output[0]} {log}")
BWA INDEX

Creates a BWA index.

Software dependencies
  • bwa ==0.7.15
Example

This wrapper can be used in the following way:

rule bwa_index:
    input:
        "{genome}.fasta"
    output:
        "{genome}.amb",
        "{genome}.ann",
        "{genome}.bwt",
        "{genome}.pac",
        "{genome}.sa"
    log:
        "logs/bwa_index/{genome}.log"
    params:
        prefix="{genome}",
        algorithm="bwtsw"
    wrapper:
        "0.18.0/bio/bwa/index"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
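
The prefix and algorithm params are both optional; when set, the wrapper passes them to bwa index as -p and -a. A standalone sketch of how the command line is assembled (the function name bwa_index_command is hypothetical and not part of the wrapper):

```python
def bwa_index_command(fasta, prefix="", algorithm=""):
    """Assemble a bwa index call from the optional params (hypothetical helper)."""
    parts = ["bwa", "index"]
    if prefix:
        parts += ["-p", prefix]      # prefix of the index files to write
    if algorithm:
        parts += ["-a", algorithm]   # index construction algorithm, e.g. bwtsw
    parts.append(fasta)
    return " ".join(parts)


print(bwa_index_command("genome.fasta", prefix="genome", algorithm="bwtsw"))
```

With prefix="{genome}", the index files carry the names listed under output in the example rule.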

Authors
  • Patrik Smeds
Code
__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2016, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"

from os import path

from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

#Check inputs/arguments.
if len(snakemake.input) == 0:
    raise ValueError("A reference genome has to be provided!")
elif len(snakemake.input) > 1:
    raise ValueError("Only one reference genome can be provided!")

#Prefix that should be used for the database
prefix = snakemake.params.get("prefix", "")

if len(prefix) > 0:
    prefix = "-p " + prefix

#Construction algorithm that will be used to build the database, default is bwtsw
construction_algorithm = snakemake.params.get("algorithm", "")

if len(construction_algorithm) != 0:
    construction_algorithm = "-a " + construction_algorithm

shell(
    "bwa index"
    " {prefix}"
    " {construction_algorithm}"
    " {snakemake.input[0]}"
    " {log}")
BWA MEM

Map reads using bwa mem, with optional sorting using samtools or picard.

Software dependencies
  • bwa ==0.7.15
  • samtools ==1.5
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule bwa_mem:
    input:
        ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        "mapped/{sample}.bam"
    log:
        "logs/bwa_mem/{sample}.log"
    params:
        index="genome",
        extra=r"-R '@RG\tID:{sample}\tSM:{sample}'",
        sort="none",             # Can be 'none', 'samtools' or 'picard'.
        sort_order="queryname",  # Can be 'queryname' or 'coordinate'.
        sort_extra=""            # Extra args for samtools/picard.
    threads: 8
    wrapper:
        "0.18.0/bio/bwa/mem"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
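
The sort, sort_order and sort_extra params determine which command receives the alignments produced by bwa mem. The sketch below restates that selection as a standalone function so the resulting commands can be inspected; the function name sort_pipe_cmd is hypothetical, and the wrapper itself builds the command inline.

```python
from os import path


def sort_pipe_cmd(output, sort="none", sort_order="coordinate", sort_extra=""):
    """Return the command that consumes the SAM stream (standalone restatement)."""
    if sort == "none":
        # No sorting: only convert SAM to BAM.
        return "samtools view -Sbh -o {} -".format(output)
    if sort == "samtools":
        if sort_order == "queryname":
            sort_extra += " -n"
        # Temporary files are placed next to the output file.
        sort_extra += " -T " + path.splitext(output)[0] + ".tmp"
        return "samtools sort {} -o {} -".format(sort_extra, output)
    if sort == "picard":
        return ("picard SortSam {} INPUT=/dev/stdin OUTPUT={} SORT_ORDER={}"
                .format(sort_extra, output, sort_order))
    raise ValueError("Unexpected value for sort ({})".format(sort))


print(sort_pipe_cmd("mapped/a.bam", sort="samtools", sort_order="queryname"))
```

Note that sort_order only has an effect when sort is 'samtools' or 'picard'.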

Authors
  • Johannes Köster
  • Julian de Ruiter
Code
__author__ = "Johannes Köster, Julian de Ruiter"
__copyright__ = "Copyright 2016, Johannes Köster and Julian de Ruiter"
__email__ = "koester@jimmy.harvard.edu, julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Check inputs/arguments.
if len(snakemake.input) not in {1, 2}:
    raise ValueError("input must have 1 (single-end) or "
                     "2 (paired-end) elements")

if sort_order not in {"coordinate", "queryname"}:
    raise ValueError("Unexpected value for sort_order ({})".format(sort_order))

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

shell(
    "(bwa mem"
    " -t {snakemake.threads}"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input}"
    " | " + pipe_cmd + ") {log}")
BWA SAMPE

Map paired-end reads with bwa sampe.

Software dependencies
  • bwa ==0.7.15
  • samtools ==1.3
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule bwa_sampe:
    input:
        fastq=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"],
        sai=["sai/{sample}.1.sai", "sai/{sample}.2.sai"]
    output:
        "mapped/{sample}.bam"
    params:
        index="genome",
        extra=r"-r '@RG\tID:{sample}\tSM:{sample}'", # optional: Extra parameters for bwa.
        sort="none",                                 # optional: Enable sorting. Possible values: 'none', 'samtools' or 'picard'
        sort_order="queryname",                      # optional: Sort by 'queryname' or 'coordinate'
        sort_extra=""                                # optional: extra arguments for samtools/picard
    log:
        "logs/bwa_sampe/{sample}.log"
    wrapper:
        "0.18.0/bio/bwa/sampe"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for bwa sampe."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


# Check inputs.
if not len(snakemake.input.sai) == 2:
    raise ValueError('input.sai must have 2 elements')

if not len(snakemake.input.fastq) == 2:
    raise ValueError('input.fastq must have 2 elements')

# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    # Use prefix for temp.
    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

# Run command.
shell(
    "(bwa sampe"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input.sai}"
    " {snakemake.input.fastq}"
    " | " + pipe_cmd + ") {log}")
BWA SAMSE

Map single-end reads with bwa samse.

Software dependencies
  • bwa ==0.7.15
  • samtools ==1.3
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule bwa_samse:
    input:
        fastq="reads/{sample}.1.fastq",
        sai="sai/{sample}.1.sai"
    output:
        "mapped/{sample}.bam"
    params:
        index="genome",
        extra=r"-r '@RG\tID:{sample}\tSM:{sample}'", # optional: Extra parameters for bwa.
        sort="none",                                 # optional: Enable sorting. Possible values: 'none', 'samtools' or 'picard'
        sort_order="queryname",                      # optional: Sort by 'queryname' or 'coordinate'
        sort_extra=""                                # optional: extra arguments for samtools/picard
    log:
        "logs/bwa_samse/{sample}.log"
    wrapper:
        "0.18.0/bio/bwa/samse"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for bwa samse."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    # Use prefix for temp.
    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

# Run command.
shell(
    "(bwa samse"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input.sai}"
    " {snakemake.input.fastq}"
    " | " + pipe_cmd + ") {log}")

CUTADAPT

Wrappers

CUTADAPT-PE

Trim paired-end reads using cutadapt.

Software dependencies
  • cutadapt ==1.13
Example

This wrapper can be used in the following way:

rule cutadapt:
    input:
        ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        fastq1="trimmed/{sample}.1.fastq",
        fastq2="trimmed/{sample}.2.fastq",
        qc="trimmed/{sample}.qc.txt"
    params:
        "-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -q 20"
    log:
        "logs/cutadapt/{sample}.log"
    wrapper:
        "0.18.0/bio/cutadapt/pe"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for trimming paired-end reads using cutadapt."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


n = len(snakemake.input)
assert n == 2, "Input must contain 2 (paired-end) elements."

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "cutadapt"
    " {snakemake.params}"
    " -o {snakemake.output.fastq1}"
    " -p {snakemake.output.fastq2}"
    " {snakemake.input}"
    " > {snakemake.output.qc} {log}")
CUTADAPT-SE

Trim single-end reads using cutadapt.

Software dependencies
  • cutadapt ==1.13
Example

This wrapper can be used in the following way:

rule cutadapt:
    input:
        "reads/{sample}.fastq"
    output:
        fastq="trimmed/{sample}.fastq",
        qc="trimmed/{sample}.qc.txt"
    params:
        "-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -q 20"
    log:
        "logs/cutadapt/{sample}.log"
    wrapper:
        "0.18.0/bio/cutadapt/se"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for trimming single-end reads using cutadapt."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "cutadapt"
    " {snakemake.params}"
    " -o {snakemake.output.fastq}"
    " {snakemake.input[0]}"
    " > {snakemake.output.qc} {log}")

DELLY

Call variants with delly.

Software dependencies

  • delly ==0.7.7

Example

This wrapper can be used in the following way:

rule delly:
    input:
        ref="genome.fasta",
        samples=["mapped/a.bam"],
        # optional exclude template (see https://github.com/dellytools/delly)
        exclude="human.hg19.excl.tsv"
    output:
        "sv/{type,(DEL|DUP|INV|TRA|INS)}.bcf"
    params:
        vartype="{type}", # variant type to call (can be wildcard, hardcoded string or function)
        extra=""  # optional parameters for delly (except -t, -g)
    log:
        "logs/delly/{type}.log"
    threads: 3
    wrapper:
        "0.18.0/bio/delly"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors

  • Johannes Köster

Code

__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


try:
    exclude = "-x " + snakemake.input.exclude
except AttributeError:
    exclude = ""


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell(
    "OMP_NUM_THREADS={snakemake.threads} delly call {extra} "
    "{exclude} -t {snakemake.params.vartype} -g {snakemake.input.ref} "
    "-o {snakemake.output[0]} {snakemake.input.samples} {log}")

FASTQ_SCREEN

fastq_screen screens a library of sequences in FASTQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.

This wrapper allows the configuration to be passed either as a filename or as a dictionary in the rule’s params.fastq_screen_config. For example, the following configuration file:

DATABASE      ecoli   /data/Escherichia_coli/Bowtie2Index/genome      BOWTIE2
DATABASE      ecoli   /data/Escherichia_coli/BowtieIndex/genome       BOWTIE
DATABASE      hg19    /data/hg19/Bowtie2Index/genome  BOWTIE2
DATABASE      mm10    /data/mm10/Bowtie2Index/genome  BOWTIE2
BOWTIE        /path/to/bowtie
BOWTIE2       /path/to/bowtie2

becomes:

fastq_screen_config = {
 'database': {
   'ecoli': {
     'bowtie2': '/data/Escherichia_coli/Bowtie2Index/genome',
     'bowtie': '/data/Escherichia_coli/BowtieIndex/genome'},
   'hg19': {
     'bowtie2': '/data/hg19/Bowtie2Index/genome'},
   'mm10': {
     'bowtie2': '/data/mm10/Bowtie2Index/genome'}
 },
 'aligner_paths': {'bowtie': 'bowtie', 'bowtie2': 'bowtie2'}
}
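
The correspondence between the two forms can be sketched as a small function that renders the dictionary as configuration lines. The function name fastq_screen_config_lines is hypothetical; the wrapper performs the same conversion inline and writes the result to a temporary file.

```python
def fastq_screen_config_lines(config):
    """Render the dictionary form as tab-separated fastq_screen config lines."""
    lines = []
    for label, indexes in config["database"].items():
        for aligner_name, index in indexes.items():
            lines.append("\t".join(["DATABASE", label, index, aligner_name.upper()]))
    for aligner_name, aligner_path in config["aligner_paths"].items():
        lines.append("\t".join([aligner_name.upper(), aligner_path]))
    return lines


example = {
    "database": {"hg19": {"bowtie2": "/data/hg19/Bowtie2Index/genome"}},
    "aligner_paths": {"bowtie2": "bowtie2"},
}
print("\n".join(fastq_screen_config_lines(example)))
```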

By default, the wrapper will use bowtie2 as the aligner and a subset of 100000 reads. These can be overridden using params.aligner and params.subset respectively. Furthermore, params.extra can be used to pass additional arguments verbatim to fastq_screen, for example extra="--illumina1_3" or extra="--bowtie2 '--trim5=8'".

Software dependencies

  • fastq-screen ==0.5.2
  • bowtie2 ==2.2.6
  • bowtie ==1.1.2

Example

This wrapper can be used in the following way:

rule fastq_screen:
    input:
        "samples/{sample}.fastq.gz"
    output:
        txt="qc/{sample}.fastq_screen.txt",
        png="qc/{sample}.fastq_screen.png"
    params:
        fastq_screen_config=fastq_screen_config,
        subset=100000,
        aligner='bowtie2'
    threads: 8
    wrapper:
        "0.18.0/bio/fastq_screen"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • fastq_screen hard-codes the output filenames. This wrapper moves the hard-coded output files to those specified by the rule.
  • While the dictionary form of fastq_screen_config is convenient, the unordered nature of the dictionary may cause snakemake --list-params-changed to incorrectly report changed parameters even though the contents remain the same. If you plan on using --list-params-changed, it is better to write a config file and pass its filename as fastq_screen_config. This problem will disappear with Python 3.6.
  • When providing the dictionary form of fastq_screen_config, the wrapper will write a temp file using Python’s tempfile module. To control the temp file directory, make sure the $TMPDIR environment variable is set (see the tempfile docs for details). One way of doing this is by adding something like shell.prefix("export TMPDIR=/scratch; ") to the snakefile calling this wrapper.
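
The effect of TMPDIR on Python's tempfile module can be checked in isolation. The following sketch uses only the standard library; the function name tempdir_with_tmpdir is hypothetical. It points TMPDIR at a given directory, clears tempfile's cached choice, and reports which directory tempfile would then use.

```python
import os
import tempfile


def tempdir_with_tmpdir(scratch):
    """Return the directory tempfile would use when TMPDIR points at scratch."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    os.environ["TMPDIR"] = scratch
    tempfile.tempdir = None  # forget the cached choice so TMPDIR is read again
    try:
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so the sketch has no lasting side effects.
        tempfile.tempdir = old_cached
        if old_env is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = old_env
```

The directory must exist and be writable; otherwise tempfile falls back to its other candidate locations.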

Authors

  • Ryan Dale

Code

import os
from snakemake.shell import shell
import tempfile

__author__ = "Ryan Dale"
__copyright__ = "Copyright 2016, Ryan Dale"
__email__ = "dalerr@niddk.nih.gov"
__license__ = "MIT"

_config = snakemake.params['fastq_screen_config']

subset = snakemake.params.get('subset', 100000)
aligner = snakemake.params.get('aligner', 'bowtie2')
extra = snakemake.params.get('extra', '')
log = snakemake.log_fmt_shell()

# snakemake.params.fastq_screen_config can be either a dict or a string. If
# string, interpret as a filename pointing to the fastq_screen config file.
# Otherwise, create a new tempfile out of the contents of the dict:
if isinstance(_config, dict):
    tmp = tempfile.NamedTemporaryFile(delete=False).name
    with open(tmp, 'w') as fout:
        for label, indexes in _config['database'].items():
            # Use distinct loop names so that the `aligner` parameter read
            # above is not overwritten before it is passed to --aligner.
            for db_aligner, index in indexes.items():
                fout.write('\t'.join([
                    'DATABASE', label, index, db_aligner.upper()]) + '\n')
        for aligner_name, path in _config['aligner_paths'].items():
            fout.write('\t'.join([aligner_name.upper(), path]) + '\n')
    config_file = tmp
else:
    config_file = _config

# fastq_screen hard-codes filenames according to this prefix. We will send
# hard-coded output to a temp dir, and then move them later.
prefix = os.path.basename(snakemake.input[0].split('.fastq')[0])
tempdir = tempfile.mkdtemp()

shell(
    "fastq_screen --outdir {tempdir} "
    "--force "
    "--aligner {aligner} "
    "--conf {config_file} "
    "--subset {subset} "
    "--threads {snakemake.threads} "
    "{extra} "
    "{snakemake.input[0]} "
    "{log}"
)

# Move output to the filenames specified by the rule
shell("mv {tempdir}/{prefix}_screen.txt {snakemake.output.txt}")
shell("mv {tempdir}/{prefix}_screen.png {snakemake.output.png}")

# Clean up temp
shell("rm -r {tempdir}")
if isinstance(_config, dict):
    shell("rm {tmp}")

FASTQC

Generate fastq qc statistics using fastqc.

Software dependencies

  • fontconfig ==2.12.1
  • fastqc ==0.11.5

Example

This wrapper can be used in the following way:

rule fastqc:
    input:
        "reads/{sample}.fastq"
    output:
        html="qc/{sample}.html",
        zip="qc/{sample}.zip"
    params: ""
    wrapper:
        "0.18.0/bio/fastqc"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors

  • Julian de Ruiter

Code

"""Snakemake wrapper for fastqc."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


def basename_without_ext(file_path):
    """Returns basename of file path, without the file extension."""

    base = path.basename(file_path)

    split_ind = 2 if base.endswith(".gz") else 1
    base = ".".join(base.split(".")[:-split_ind])

    return base


# Run fastqc.
output_dir = path.dirname(snakemake.output.html)

shell("fastqc {snakemake.params} --quiet "
      "--outdir {output_dir} {snakemake.input[0]}")

# Move outputs into proper position.
output_base = basename_without_ext(snakemake.input[0])
html_path = path.join(output_dir, output_base + "_fastqc.html")
zip_path = path.join(output_dir, output_base + "_fastqc.zip")

if snakemake.output.html != html_path:
    shell("mv {html_path} {snakemake.output.html}")

if snakemake.output.zip != zip_path:
    shell("mv {zip_path} {snakemake.output.zip}")
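
The renaming step exists because fastqc derives its output names from the input file name. A standalone sketch of the names the wrapper expects before moving them, following the same rule as basename_without_ext above (the function name fastqc_default_outputs is hypothetical):

```python
from os import path


def fastqc_default_outputs(input_path, output_dir):
    """Return the (html, zip) paths the wrapper expects fastqc to have written."""
    base = path.basename(input_path)
    # Drop ".gz" plus one more extension for compressed input, else one extension.
    n_ext = 2 if base.endswith(".gz") else 1
    base = ".".join(base.split(".")[:-n_ext])
    return (path.join(output_dir, base + "_fastqc.html"),
            path.join(output_dir, base + "_fastqc.zip"))


print(fastqc_default_outputs("reads/a.fastq.gz", "qc"))
```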

FREEBAYES

Call small genomic variants with freebayes.

Software dependencies

  • freebayes ==1.1.0
  • bcftools ==1.5

Example

This wrapper can be used in the following way:

rule freebayes:
    input:
        ref="genome.fasta",
        # you can have a list of samples here
        samples="mapped/{sample}.bam"
    output:
        "calls/{sample}.vcf"  # either .vcf or .bcf
    log:
        "logs/freebayes/{sample}.log"
    params:
        ""  # optional parameters
    wrapper:
        "0.18.0/bio/freebayes"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors

  • Johannes Köster

Code

__author__ = "Johannes Köster"
__copyright__ = "Copyright 2017, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"


from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

pipe = ""
if snakemake.output[0].endswith(".bcf"):
    pipe = "| bcftools view -Ob -"

shell("(freebayes {snakemake.params} -f {snakemake.input.ref} "
      " {snakemake.input.samples} {pipe} > {snakemake.output[0]}) {log}")

HISAT2

Map reads with hisat2.

Software dependencies

  • hisat2 ==2.1.0
  • samtools ==1.5

Example

This wrapper can be used in the following way:

rule hisat2:
    input:
      reads=["reads/{sample}.1.fastq.gz", "reads/{sample}.2.fastq.gz"],
    output:
      "mapped/{sample}.bam"
    log:                                # optional
      "logs/hisat2/{sample}.log"
    params:                             # idx is required, extra is optional
      idx="genome.fa",
      extra="--min-intronlen 1000"
    threads: 8                          # optional, defaults to 1
    wrapper:
      "0.18.0/bio/hisat2"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The -S flag must not be used since output is already directly piped to samtools for compression.
  • The --threads/-p flag must not be used since threads is set separately via the snakemake threads directive.
  • The wrapper does not yet handle SRA input accessions.
  • The reference index files are not checked, since the actual number of files may differ depending on the reference sequence size. This is also why the index is supplied in the params directive instead of the input directive.

Authors

  • Wibowo Arindrarto

Code

__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"


from snakemake.shell import shell

# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
# Run log
log = snakemake.log_fmt_shell()

# Input file wrangling
reads = snakemake.input.get("reads")
if isinstance(reads, str):
    input_flags = "-U {0}".format(reads)
elif len(reads) == 1:
    input_flags = "-U {0}".format(reads[0])
elif len(reads) == 2:
    input_flags = "-1 {0} -2 {1}".format(*reads)
else:
    raise RuntimeError(
        "Reads parameter must contain at least 1 and at most 2"
        " input files.")

# Executed shell command
shell(
    "(hisat2 {extra} --threads {snakemake.threads}"
    " -x {snakemake.params.idx} {input_flags}"
    " | samtools view -Sbh -o {snakemake.output[0]} -)"
    " {log}")

MULTIQC

Generate qc report using multiqc.

Software dependencies

  • multiqc ==1.2
  • networkx <2.0

Example

This wrapper can be used in the following way:

rule multiqc:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"])
    output:
        "qc/multiqc.html"
    params:
        ""  # Optional: extra parameters for multiqc.
    log:
        "logs/multiqc.log"
    wrapper:
        "0.18.0/bio/multiqc"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
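
The wrapper code listed below does not hand the input files themselves to multiqc. It passes the set of directories that contain them, so MultiQC searches those directories as a whole. A minimal sketch of that step follows; the function name `multiqc_search_dirs` is hypothetical, and the sorting is an addition that only makes the result deterministic.

```python
from os import path


def multiqc_search_dirs(input_files):
    """Return the unique parent directories of the given files, sorted."""
    return sorted(set(path.dirname(fp) for fp in input_files))


files = ["samtools_stats/a.txt", "samtools_stats/b.txt"]
print(multiqc_search_dirs(files))  # ['samtools_stats']
```

Because whole directories are searched, other recognizable reports stored next to the declared inputs may end up in the report as well.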

Authors

  • Julian de Ruiter

Code

"""Snakemake wrapper for MultiQC."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


input_dirs = set(path.dirname(fp) for fp in snakemake.input)
output_dir = path.dirname(snakemake.output[0])
output_name = path.basename(snakemake.output[0])
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell(
    "multiqc"
    " {snakemake.params}"
    " --force"
    " -o {output_dir}"
    " -n {output_name}"
    " {input_dirs}"
    " {log}")

NGS-DISAMBIGUATE

Disambiguation algorithm for reads aligned to two species (e.g. human and mouse genomes) from Tophat, Hisat2, STAR or BWA mem.

Software dependencies

  • ngs-disambiguate ==2016.11.10

Example

This wrapper can be used in the following way:

rule disambiguate:
    input:
        a="mapped/{sample}.a.bam",
        b="mapped/{sample}.b.bam"
    output:
        a_ambiguous='disambiguate/{sample}.graft.ambiguous.bam',
        b_ambiguous='disambiguate/{sample}.host.ambiguous.bam',
        a_disambiguated='disambiguate/{sample}.graft.bam',
        b_disambiguated='disambiguate/{sample}.host.bam',
        summary='qc/disambiguate/{sample}.txt'
    params:
        algorithm="bwa",
        # optional: Prefix to use for output. If omitted, a
        # suitable value is guessed from the output paths. Prefix
        # is used for the intermediate output paths, as well as
        # sample name in summary file.
        prefix="{sample}",
        extra=""
    wrapper:
        "0.18.0/bio/ngs-disambiguate"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors

  • Julian de Ruiter

Code

"""Snakemake wrapper for ngs-disambiguate (from Astrazeneca)."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from os import path

from snakemake.shell import shell


# Extract arguments.
prefix = snakemake.params.get("prefix", None)
extra = snakemake.params.get("extra", "")

output_dir = path.dirname(snakemake.output.a_ambiguous)
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# If prefix is not given, we use the summary path to derive the most
# probable sample name (as the summary path is least likely to contain
# additional suffixes). This is better than using a random id as prefix,
# as the prefix is also used as the sample name in the summary file.
if prefix is None:
    prefix = path.splitext(path.basename(snakemake.output.summary))[0]

# Run command.
shell(
    "ngs_disambiguate"
    " {extra}"
    " -o {output_dir}"
    " -s {prefix}"
    " -a {snakemake.params.algorithm}"
    " {snakemake.input.a}"
    " {snakemake.input.b}"
    " {log}")

# Move outputs into expected positions.
output_base = path.join(output_dir, prefix)

output_map = {
    output_base + ".ambiguousSpeciesA.bam":
        snakemake.output.a_ambiguous,
    output_base + ".ambiguousSpeciesB.bam":
        snakemake.output.b_ambiguous,
    output_base + ".disambiguatedSpeciesA.bam":
        snakemake.output.a_disambiguated,
    output_base + ".disambiguatedSpeciesB.bam":
        snakemake.output.b_disambiguated,
    output_base + "_summary.txt":
        snakemake.output.summary
}

for src, dest in output_map.items():
    if src != dest:
        shell('mv {src} {dest}')
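
The prefix guessing and the renaming step above can be sketched in isolation. This is an illustration only: the helper names `guess_prefix` and `intermediate_outputs` are hypothetical, and the file suffixes are copied from the mapping in the wrapper code.

```python
from os import path


def guess_prefix(summary_path):
    """Derive the sample prefix from the summary path, as the wrapper does."""
    return path.splitext(path.basename(summary_path))[0]


def intermediate_outputs(output_dir, prefix):
    """List the paths the wrapper expects ngs_disambiguate to write."""
    base = path.join(output_dir, prefix)
    suffixes = [
        ".ambiguousSpeciesA.bam",
        ".ambiguousSpeciesB.bam",
        ".disambiguatedSpeciesA.bam",
        ".disambiguatedSpeciesB.bam",
        "_summary.txt",
    ]
    return [base + suffix for suffix in suffixes]


prefix = guess_prefix("qc/disambiguate/s1.txt")
print(prefix)  # s1
for intermediate in intermediate_outputs("disambiguate", prefix):
    print(intermediate)
```

Each of these intermediate paths is then moved to the corresponding output path of the rule, unless the two are already identical.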

PICARD

Wrappers

PICARD ADDORREPLACEREADGROUPS

Add or replace read groups with picard tools.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule replace_rg:
    input:
        "mapped/{sample}.bam"
    output:
        "fixed-rg/{sample}.bam"
    log:
        "logs/picard/replace_rg/{sample}.log"
    params:
        "RGLB=lib1 RGPL=illumina RGPU={sample} RGSM={sample}"
    wrapper:
        "0.18.0/bio/picard/addorreplacereadgroups"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("picard AddOrReplaceReadGroups {snakemake.params} I={snakemake.input} "
      "O={snakemake.output} &> {snakemake.log}")

PICARD COLLECTALIGNMENTSUMMARYMETRICS

Collect metrics on aligned reads with picard tools.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule alignment_summary:
    input:
        ref="genome.fasta",
        bam="mapped/{sample}.bam"
    output:
        "stats/{sample}.summary.txt"
    log:
        "logs/picard/alignment-summary/{sample}.log"
    params:
        # optional parameters (e.g. relax checks as below)
        "VALIDATION_STRINGENCY=LENIENT "
        "METRIC_ACCUMULATION_LEVEL=null "
        "METRIC_ACCUMULATION_LEVEL=SAMPLE"
    wrapper:
        "0.18.0/bio/picard/collectalignmentsummarymetrics"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"


from snakemake.shell import shell


log = snakemake.log_fmt_shell()


shell("picard CollectAlignmentSummaryMetrics {snakemake.params} "
      "INPUT={snakemake.input.bam} OUTPUT={snakemake.output[0]} "
      "REFERENCE_SEQUENCE={snakemake.input.ref} {log}")

PICARD COLLECTHSMETRICS

Collects hybrid-selection (HS) metrics for a SAM or BAM file using picard.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule picard_collect_hs_metrics:
    input:
        bam="mapped/{sample}.bam",
        reference="genome.fasta",
        # Baits and targets should be given as interval lists. These can
        # be generated from bed files using picard BedToIntervalList.
        bait_intervals="regions.intervals",
        target_intervals="regions.intervals"
    output:
        "stats/hs_metrics/{sample}.txt"
    params:
        # Optional extra arguments. Here we reduce sample size
        # to reduce the runtime in our unit test. The wrapper reads
        # them from the named parameter "extra".
        extra="SAMPLE_SIZE=1000"
    log:
        "logs/picard_collect_hs_metrics/{sample}.log"
    wrapper:
        "0.18.0/bio/picard/collecthsmetrics"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for picard CollectHSMetrics."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


# Optional extra arguments for picard.
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "picard CollectHsMetrics"
    " {extra}"
    " INPUT={snakemake.input.bam}"
    " OUTPUT={snakemake.output[0]}"
    " REFERENCE_SEQUENCE={snakemake.input.reference}"
    " BAIT_INTERVALS={snakemake.input.bait_intervals}"
    " TARGET_INTERVALS={snakemake.input.target_intervals}"
    " {log}")

PICARD COLLECTINSERTSIZEMETRICS

Collect metrics on insert size of paired end reads with picard tools.

Software dependencies
  • picard ==2.9.2
  • r-base ==3.3.2
Example

This wrapper can be used in the following way:

rule insert_size:
    input:
        "mapped/{sample}.bam"
    output:
        txt="stats/{sample}.isize.txt",
        pdf="stats/{sample}.isize.pdf"
    log:
        "logs/picard/insert_size/{sample}.log"
    params:
        # optional parameters (e.g. relax checks as below)
        "VALIDATION_STRINGENCY=LENIENT "
        "METRIC_ACCUMULATION_LEVEL=null "
        "METRIC_ACCUMULATION_LEVEL=SAMPLE"
    wrapper:
        "0.18.0/bio/picard/collectinsertsizemetrics"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"


from snakemake.shell import shell


log = snakemake.log_fmt_shell()


shell("picard CollectInsertSizeMetrics {snakemake.params} "
      "INPUT={snakemake.input} OUTPUT={snakemake.output.txt} "
      "HISTOGRAM_FILE={snakemake.output.pdf} {log}")

PICARD MARKDUPLICATES

Mark PCR and optical duplicates with picard tools.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule mark_duplicates:
    input:
        "mapped/{sample}.bam"
    output:
        bam="dedup/{sample}.bam",
        metrics="dedup/{sample}.metrics.txt"
    log:
        "logs/picard/dedup/{sample}.log"
    params:
        "REMOVE_DUPLICATES=true"
    wrapper:
        "0.18.0/bio/picard/markduplicates"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("picard MarkDuplicates {snakemake.params} INPUT={snakemake.input} "
      "OUTPUT={snakemake.output.bam} METRICS_FILE={snakemake.output.metrics} "
      "&> {snakemake.log}")

PICARD MERGESAMFILES

Merge sam/bam files using picard tools.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule merge_bams:
    input:
        expand("mapped/{sample}.bam", sample=["a", "b"])
    output:
        "merged.bam"
    log:
        "logs/picard_mergesamfiles.log"
    params:
        "VALIDATION_STRINGENCY=LENIENT"
    wrapper:
        "0.18.0/bio/picard/mergesamfiles"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
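
In the wrapper code listed below, every input file becomes its own INPUT= argument on the picard command line. A minimal sketch of that formatting step, with the hypothetical function name `picard_inputs`:

```python
def picard_inputs(files):
    """Format each file as a separate INPUT= argument for picard."""
    return " ".join("INPUT={}".format(f) for f in files)


print(picard_inputs(["mapped/a.bam", "mapped/b.bam"]))
# INPUT=mapped/a.bam INPUT=mapped/b.bam
```

This is why the rule can list any number of BAM files as input, for example via expand.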

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for picard MergeSamFiles."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


inputs = " ".join("INPUT={}".format(in_) for in_ in snakemake.input)
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    "picard"
    " MergeSamFiles"
    " {snakemake.params}"
    " {inputs}"
    " OUTPUT={snakemake.output[0]}"
    " {log}")

PICARD SORTSAM

Sort sam/bam files using picard tools.

Software dependencies
  • picard ==2.9.2
Example

This wrapper can be used in the following way:

rule sort_bam:
    input:
        "mapped/{sample}.bam"
    output:
        "sorted/{sample}.bam"
    log:
        "logs/picard/sort_sam/{sample}.log"
    params:
        sort_order="coordinate",
        extra="VALIDATION_STRINGENCY=LENIENT" # optional: Extra arguments for picard.
    wrapper:
        "0.18.0/bio/picard/sortsam"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code
"""Snakemake wrapper for picard SortSam."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

shell(
    'picard'
    ' SortSam'
    ' {extra}'
    ' INPUT={snakemake.input[0]}'
    ' OUTPUT={snakemake.output[0]}'
    ' SORT_ORDER={snakemake.params.sort_order}'
    ' {log}')

PINDEL

Wrappers

PINDEL

Call variants with pindel.

Software dependencies
  • pindel ==0.2.5b8
Example

This wrapper can be used in the following way:

pindel_types = ["D", "BP", "INV", "TD", "LI", "SI", "RP"]


rule pindel:
    input:
        ref="genome.fasta",
        # samples to call
        samples=["mapped/a.bam"],
        # bam configuration file, see http://gmt.genome.wustl.edu/packages/pindel/quick-start.html
        config="pindel_config.txt"
    output:
        expand("pindel/all_{type}", type=pindel_types)
    params:
        # prefix must be consistent with output files
        prefix="pindel/all",
        extra=""  # optional parameters (except -i, -f, -o)
    log:
        "logs/pindel.log"
    threads: 4
    wrapper:
        "0.18.0/bio/pindel/call"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
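
The comment "prefix must be consistent with output files" in the example means that every declared output has to equal the prefix followed by an underscore and one of the type codes. A small sketch of that relationship in plain Python; `pindel_outputs` and `outputs_consistent` are hypothetical helpers, not part of the wrapper:

```python
pindel_types = ["D", "BP", "INV", "TD", "LI", "SI", "RP"]


def pindel_outputs(prefix, types=pindel_types):
    """Return the output paths implied by a pindel prefix: <prefix>_<type>."""
    return ["{}_{}".format(prefix, t) for t in types]


def outputs_consistent(declared, prefix, types=pindel_types):
    """Check that the declared outputs match the paths implied by the prefix."""
    return set(declared) == set(pindel_outputs(prefix, types))


outputs = pindel_outputs("pindel/all")
print(outputs[:2])  # ['pindel/all_D', 'pindel/all_BP']
print(outputs_consistent(outputs, "pindel/all"))  # True
```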

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"

from snakemake.shell import shell

# Optional extra arguments; an empty string is used if none are given.
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell(
    "pindel {extra} -i {snakemake.input.config} -f {snakemake.input.ref}"
    " -o {snakemake.params.prefix} {log}")

PINDEL2VCF

Convert pindel output to vcf.

Software dependencies
  • pindel ==0.2.5b8
Example

This wrapper can be used in the following way:

rule pindel2vcf:
    input:
        ref="genome.fasta",
        pindel="pindel/all_{type}"
    output:
        "pindel/all_{type}.vcf"
    params:
        refname="hg38",  # mandatory, see pindel manual
        refdate="20170110",  # mandatory, see pindel manual
        extra=""  # extra params (except -r, -p, -R, -d, -v)
    log:
        "logs/pindel/pindel2vcf.{type}.log"
    wrapper:
        "0.18.0/bio/pindel/pindel2vcf"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"

from snakemake.shell import shell

# Optional extra arguments; an empty string is used if none are given.
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell(
    "pindel2vcf {extra} -p {snakemake.input.pindel} -r {snakemake.input.ref}"
    " -R {snakemake.params.refname} -d {snakemake.params.refdate}"
    " -v {snakemake.output[0]} {log}")

SAMBAMBA

Wrappers

SAMBAMBA SORT

Sort bam file with sambamba.

Software dependencies
  • sambamba ==0.6.6
Example

This wrapper can be used in the following way:

rule sambamba_sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    params:
        ""  # optional parameters
    threads: 8
    wrapper:
        "0.18.0/bio/sambamba/sort"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell

shell(
    "sambamba sort {snakemake.params} -t {snakemake.threads} "
    "-o {snakemake.output[0]} {snakemake.input[0]}")

SAMTOOLS

Wrappers

SAMTOOLS INDEX

Index bam file with samtools.

Software dependencies
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule samtools_index:
    input:
        "A.sorted.bam"
    output:
        "A.sorted.bam.bai"
    params:
        "" # optional params string
    wrapper:
        "0.18.0/bio/samtools/index"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("samtools index {snakemake.params} {snakemake.input[0]} {snakemake.output[0]}")

SAMTOOLS MERGE

Merge two bam files with samtools.

Software dependencies
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule samtools_merge:
    input:
        ["mapped/A.bam", "mapped/B.bam"]
    output:
        "merged.bam"
    params:
        "" # optional additional parameters as string
    threads: 8
    wrapper:
        "0.18.0/bio/samtools/merge"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("samtools merge --threads {snakemake.threads} {snakemake.params} "
      "{snakemake.output[0]} {snakemake.input}")

SAMTOOLS SORT

Sort bam file with samtools.

Software dependencies
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule samtools_sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    params:
        "-m 4G"
    threads: 8
    wrapper:
        "0.18.0/bio/samtools/sort"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
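
The wrapper code listed below derives the prefix for samtools' temporary files (the -T option) from the output path by dropping its last extension. A minimal sketch of that derivation, with the hypothetical function name `sort_tmp_prefix`:

```python
import os


def sort_tmp_prefix(output_path):
    """Drop the last extension of the output path to get the temp-file prefix."""
    return os.path.splitext(output_path)[0]


print(sort_tmp_prefix("mapped/a.sorted.bam"))  # mapped/a.sorted
```

Temporary files are therefore written next to the output file, so that directory needs enough free space for the sort.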

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


import os
from snakemake.shell import shell


prefix = os.path.splitext(snakemake.output[0])[0]

shell(
    "samtools sort {snakemake.params} -@ {snakemake.threads} -o {snakemake.output[0]} "
    "-T {prefix} {snakemake.input[0]}")

SAMTOOLS STATS

Generate stats using samtools.

Software dependencies
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule samtools_stats:
    input:
        "mapped/{sample}.bam"
    output:
        "samtools_stats/{sample}.txt"
    params:
        extra="",                       # Optional: extra arguments.
        region="1:1000000-2000000"      # Optional: region string.
    log:
        "logs/samtools_stats/{sample}.log"
    wrapper:
        "0.18.0/bio/samtools/stats"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Julian de Ruiter
Code

"""Snakemake wrapper for samtools stats."""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell


extra = snakemake.params.get("extra", "")
region = snakemake.params.get("region", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)


shell("samtools stats {extra} {snakemake.input}"
      " {region} > {snakemake.output} {log}")

SAMTOOLS VIEW

Convert or filter SAM/BAM.

Software dependencies
  • samtools ==1.5
Example

This wrapper can be used in the following way:

rule samtools_view:
    input:
        "{sample}.sam"
    output:
        "{sample}.bam"
    params:
        "-b" # optional params string
    wrapper:
        "0.18.0/bio/samtools/view"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("samtools view {snakemake.params} {snakemake.input[0]} > {snakemake.output[0]}")

SICKLE

Wrappers

SICKLE PE

Trim paired-end reads with sickle.

Software dependencies
  • sickle-trim ==1.33
Example

This wrapper can be used in the following way:

rule sickle_pe:
  input:
    r1="input_R1.fq",
    r2="input_R2.fq"
  output:
    r1="output_R1.fq",
    r2="output_R2.fq",
    rs="output_single.fq",
  params:
    qual_type="sanger",
    # optional extra parameters
    extra=""
  log:
    # optional log file
    "logs/sickle/job.log"
  wrapper:
    "0.18.0/bio/sickle/pe"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Wibowo Arindrarto
Code
__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"

from snakemake.shell import shell

# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell()

shell(
    "(sickle pe -f {snakemake.input.r1} -r {snakemake.input.r2} "
    "-o {snakemake.output.r1} -p {snakemake.output.r2} "
    "-s {snakemake.output.rs} -t {snakemake.params.qual_type} "
    "{extra}) {log}"
)

SICKLE SE

Trim single-end reads with sickle.

Software dependencies
  • sickle-trim ==1.33
Example

This wrapper can be used in the following way:

rule sickle_se:
  input:
    "input_R1.fq"
  output:
    "output_R1.fq"
  params:
    qual_type="sanger",
    # optional extra parameters
    extra=""
  log:
    "logs/sickle/job.log"
  wrapper:
    "0.18.0/bio/sickle/se"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Wibowo Arindrarto
Code
__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"

from snakemake.shell import shell

# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell()

shell(
    "(sickle se -f {snakemake.input[0]} -o {snakemake.output[0]} "
    "-t {snakemake.params.qual_type} {extra}) {log}"
)

STAR

Wrappers

STAR

Map reads with STAR.

Software dependencies
  • star ==2.5.3a
Example

This wrapper can be used in the following way:

rule star:
    input:
        sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        # see STAR manual for additional output files
        "star/{sample}/Aligned.out.bam"
    log:
        "logs/star/{sample}.log"
    params:
        # path to STAR reference genome index
        index="index",
        # optional parameters
        extra=""
    threads: 8
    wrapper:
        "0.18.0/bio/star/align"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
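
Two details of the wrapper code listed below are easy to miss: gzipped reads are detected from the first input file name, and the STAR output prefix is the directory of the first output file. The sketch below reproduces both decisions in a hypothetical function `star_options`; it raises ValueError where the wrapper uses an assert.

```python
import os


def star_options(sample, output_bam):
    """Return (readcmd, outprefix) in the way the wrapper derives them."""
    n = len(sample)
    if n not in (1, 2):
        raise ValueError(
            "sample must have 1 (single-end) or 2 (paired-end) elements")
    # Only the first file name is inspected to decide on decompression.
    readcmd = "--readFilesCommand zcat" if sample[0].endswith(".gz") else ""
    # STAR writes its files under this prefix.
    outprefix = os.path.dirname(output_bam) + "/"
    return readcmd, outprefix


print(star_options(["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"],
                   "star/a/Aligned.out.bam"))
# ('--readFilesCommand zcat', 'star/a/')
```

Since STAR names the unsorted BAM file Aligned.out.bam under the given prefix, the output of the rule should keep that file name and vary only the directory, as in the example above.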

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


import os
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

n = len(snakemake.input.sample)
assert n == 1 or n == 2, "input->sample must have 1 (single-end) or 2 (paired-end) elements."

if snakemake.input.sample[0].endswith(".gz"):
    readcmd = "--readFilesCommand zcat"
else:
    readcmd = ""


outprefix = os.path.dirname(snakemake.output[0]) + "/"


shell(
    "STAR "
    "{extra} "
    "--runThreadN {snakemake.threads} "
    "--genomeDir {snakemake.params.index} "
    "--readFilesIn {snakemake.input.sample} "
    "{readcmd} "
    "--outSAMtype BAM Unsorted "
    "--outFileNamePrefix {outprefix} "
    "--outStd Log "
    "{log}")

TRIMMOMATIC

Wrappers

TRIMMOMATIC PE

Trim paired-end reads with trimmomatic.

Software dependencies
  • trimmomatic ==0.36
Example

This wrapper can be used in the following way:

rule trimmomatic_pe:
    input:
        r1="reads/{sample}.1.fastq",
        r2="reads/{sample}.2.fastq"
    output:
        r1="trimmed/{sample}.1.fastq.gz",
        r2="trimmed/{sample}.2.fastq.gz",
        # reads where trimming entirely removed the mate
        r1_unpaired="trimmed/{sample}.1.unpaired.fastq.gz",
        r2_unpaired="trimmed/{sample}.2.unpaired.fastq.gz"
    log:
        "logs/trimmomatic/{sample}.log"
    params:
        # list of trimmers (see manual)
        trimmer=["TRAILING:3"],
        # optional parameters
        extra=""
    wrapper:
        "0.18.0/bio/trimmomatic/pe"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
trimmer = " ".join(snakemake.params.trimmer)

shell("trimmomatic PE {extra} "
      "{snakemake.input.r1} {snakemake.input.r2} "
      "{snakemake.output.r1} {snakemake.output.r1_unpaired} "
      "{snakemake.output.r2} {snakemake.output.r2_unpaired} "
      "{trimmer} "
      "{log}")

TRIMMOMATIC SE

Trim single-end reads with trimmomatic.

Software dependencies
  • trimmomatic ==0.36
Example

This wrapper can be used in the following way:

rule trimmomatic_se:
    input:
        "reads/{sample}.fastq"
    output:
        "trimmed/{sample}.fastq.gz"
    log:
        "logs/trimmomatic/{sample}.log"
    params:
        # list of trimmers (see manual)
        trimmer=["TRAILING:3"],
        # optional parameters
        extra=""
    wrapper:
        "0.18.0/bio/trimmomatic/se"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
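
The trimmer parameter is a list because the wrapper code listed below joins its entries with spaces and appends them after the input and output files. Trimmomatic applies the steps in the order given. A rough sketch of the resulting command line, using the hypothetical helper `trimmomatic_se_command`; unlike the wrapper, it drops an empty extra string instead of leaving a double space:

```python
def trimmomatic_se_command(input_fq, output_fq, trimmer, extra=""):
    """Assemble a single-end trimmomatic command line as a string."""
    parts = ["trimmomatic", "SE", extra, input_fq, output_fq, " ".join(trimmer)]
    # Skip empty parts so the command contains no repeated spaces.
    return " ".join(p for p in parts if p)


print(trimmomatic_se_command(
    "reads/a.fastq", "trimmed/a.fastq.gz", ["LEADING:3", "TRAILING:3"]))
# trimmomatic SE reads/a.fastq trimmed/a.fastq.gz LEADING:3 TRAILING:3
```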

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
trimmer = " ".join(snakemake.params.trimmer)

shell("trimmomatic SE {extra} "
      "{snakemake.input} {snakemake.output} "
      "{trimmer} "
      "{log}")

VCF

Wrappers

COMPRESS VCF

Compress and index vcf file with bgzip and tabix.

Software dependencies
  • htslib ==1.5
Example

This wrapper can be used in the following way:

rule compress_vcf:
    input:
        "{prefix}.vcf"
    output:
        "{prefix}.vcf.gz"
    wrapper:
        "0.18.0/bio/vcf/compress"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("bgzip --stdout {snakemake.input} > {snakemake.output} && tabix -p vcf {snakemake.output}")

UNCOMPRESS VCF

Uncompress vcf file with bgzip.

Software dependencies
  • htslib ==1.5
Example

This wrapper can be used in the following way:

rule uncompress_vcf:
    input:
        "{prefix}.vcf.gz"
    output:
        "{prefix}.vcf"
    wrapper:
        "0.18.0/bio/vcf/uncompress"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Authors
  • Johannes Köster
Code
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


from snakemake.shell import shell


shell("bgzip --decompress --stdout {snakemake.input} > {snakemake.output}")