The Snakemake Wrappers repository¶
The Snakemake Wrapper Repository is a collection of reusable wrappers that let you quickly use popular tools from Snakemake rules and workflows.
Usage¶
The general strategy is to include a wrapper into your workflow via the wrapper directive, e.g.
rule samtools_sort:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.sorted.bam"
    params:
        "-m 4G"
    threads: 8
    wrapper:
        "0.2.0/bio/samtools/sort"
Here, Snakemake will automatically download the corresponding wrapper from https://bitbucket.org/snakemake/snakemake-wrappers/src/0.2.0/bio/samtools/sort/wrapper.py. The 0.2.0 prefix can be replaced with the version tag you want to use, or with a commit id (see here). Pinning a version ensures reproducibility, since changes in the wrapper implementation are not propagated automatically to your workflow. Alternatively, e.g. for development, the wrapper directive can also point to a full URL, including a local file:// URL.
Each wrapper defines its required software packages and versions. In combination with the --use-conda flag of Snakemake, these are deployed automatically.
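As a rough illustration of the download location described above, the following sketch maps a wrapper directive value to a URL. The function name wrapper_script_url is hypothetical and this is not Snakemake's actual implementation. Only the Bitbucket URL pattern is taken from the text, and full URLs are simply passed through unchanged, which is an assumption.

```python
# URL pattern taken from the example in the text above.
BASE_URL = "https://bitbucket.org/snakemake/snakemake-wrappers/src"


def wrapper_script_url(spec):
    """Map a wrapper directive value to the assumed location of wrapper.py.

    Hypothetical helper for illustration only.
    """
    # Full URLs (including local file:// URLs) are returned as given.
    if "://" in spec:
        return spec
    # Otherwise, spec is "<version tag or commit id>/<path to wrapper>".
    return "{}/{}/wrapper.py".format(BASE_URL, spec.strip("/"))


print(wrapper_script_url("0.2.0/bio/samtools/sort"))
```

Replacing 0.2.0 with another tag or a commit id changes only the first path component of the resulting URL.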
Contribute¶
We invite anybody to contribute to the Snakemake Wrapper Repository. If you want to contribute we suggest the following procedure:
- fork the repository
- develop your contribution
- open a pull request
Pull requests will be reviewed and merged as quickly as possible. Contributions should follow the coding style of the existing wrappers, i.e.
- provide a meta.yaml with name, description and author of the wrapper,
- provide an environment.yaml which lists all required software packages (the packages shall be available via https://anaconda.org),
- provide an example Snakefile that shows how to use the wrapper,
- follow the python style guide,
- use 4 spaces for indentation.
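Before opening a pull request, the file requirements above can be checked with a small script. This is only a sketch: meta.yaml and environment.yaml are named in the list, while the file names "Snakefile" and "wrapper.py" and the helper missing_wrapper_files are assumptions made for illustration.

```python
import os
import tempfile

# meta.yaml and environment.yaml come from the checklist above; "Snakefile"
# and "wrapper.py" are assumed names for the example workflow and the script.
REQUIRED_FILES = ("meta.yaml", "environment.yaml", "Snakefile", "wrapper.py")


def missing_wrapper_files(wrapper_dir):
    """Return the required files that are absent from wrapper_dir."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(wrapper_dir, name))]


# Demo on a throwaway directory that only contains meta.yaml.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "meta.yaml"), "w").close()
    print(missing_wrapper_files(demo_dir))
```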
BCFTOOLS¶
Wrappers¶
BCFTOOLS CALL¶
Call variants with bcftools.
Software dependencies¶
- samtools ==1.5
- bcftools ==1.5
Example¶
This wrapper can be used in the following way:
rule bcftools_call:
    input:
        ref="genome.fasta",
        samples=expand("mapped/{sample}.sorted.bam", sample=config["samples"]),
        indexes=expand("mapped/{sample}.sorted.bam.bai", sample=config["samples"])
    output:
        # Here, we optionally use a region as wildcard and constrain it to the
        # format accepted by samtools mpileup.
        "called/{region,.+(:[0-9]+-[0-9]+)?}.bcf"
    params:
        # Optional parameters for samtools mpileup (except -g, -f).
        # In this example, we forward the region wildcard from the output file to mpileup.
        mpileup="--region {region}",
        # Optional parameters for bcftools call (except -v, -o, -m).
        call=""
    log:
        "logs/bcftools_call/{region}.log"
    wrapper:
        "0.19.2/bio/bcftools/call"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell(
"(samtools mpileup {snakemake.params.mpileup} {snakemake.input.samples} "
"--fasta-ref {snakemake.input.ref} --BCF --uncompressed | "
"bcftools call -m {snakemake.params.call} -o {snakemake.output[0]} -v -) 2> {snakemake.log}")
BCFTOOLS CONCAT¶
Concatenate vcf/bcf files with bcftools.
Software dependencies¶
- bcftools ==1.6
Example¶
This wrapper can be used in the following way:
rule bcftools_concat:
    input:
        calls=["a.bcf", "b.bcf"]
    output:
        "all.bcf"
    params:
        ""  # optional parameters for bcftools concat (except -o)
    wrapper:
        "0.19.2/bio/bcftools/concat"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell(
"bcftools concat {snakemake.params} -o {snakemake.output[0]} "
"{snakemake.input.calls}")
BCFTOOLS VIEW¶
View vcf/bcf file in a different format.
Software dependencies¶
- bcftools ==1.5
Example¶
This wrapper can be used in the following way:
rule bcf_to_vcf:
    input:
        "{prefix}.bcf"
    output:
        "{prefix}.vcf"
    params:
        ""  # optional parameters for bcftools view (except -o)
    wrapper:
        "0.19.2/bio/bcftools/view"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell(
"bcftools view {snakemake.params} {snakemake.input[0]} "
"-o {snakemake.output[0]}")
BOWTIE2¶
Wrappers¶
BOWTIE2¶
Map reads with bowtie2.
Software dependencies¶
- bowtie2 ==2.3.2
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule bowtie2:
    input:
        sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        "mapped/{sample}.bam"
    log:
        "logs/bowtie2/{sample}.log"
    params:
        index="index/genome",  # prefix of reference genome index (built with bowtie2-build)
        extra=""  # optional parameters
    threads: 8
    wrapper:
        "0.19.2/bio/bowtie2/align"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# One input file means single-end reads, two mean paired-end reads.
n = len(snakemake.input.sample)
assert n == 1 or n == 2, "input->sample must have 1 (single-end) or 2 (paired-end) elements."
if n == 1:
    reads = "-U {}".format(*snakemake.input.sample)
else:
    reads = "-1 {} -2 {}".format(*snakemake.input.sample)

# Use the local `extra` so that the rule also works when params.extra is omitted.
shell(
    "(bowtie2 --threads {snakemake.threads} {extra} "
    "-x {snakemake.params.index} {reads} "
    "| samtools view -Sbh -o {snakemake.output[0]} -) {log}")
BWA¶
Wrappers¶
BWA ALN¶
Map reads with bwa aln.
Software dependencies¶
- bwa ==0.7.15
Example¶
This wrapper can be used in the following way:
rule bwa_aln:
    input:
        "reads/{sample}.{pair}.fastq"
    output:
        "sai/{sample}.{pair}.sai"
    params:
        index="genome",
        extra=""
    log:
        "logs/bwa_aln/{sample}.{pair}.log"
    threads: 8
    wrapper:
        "0.19.2/bio/bwa/aln"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for bwa aln."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get('extra', '')
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
"bwa aln"
" {extra}"
" -t {snakemake.threads}"
" {snakemake.params.index}"
" {snakemake.input[0]}"
" > {snakemake.output[0]} {log}")
BWA INDEX¶
Creates a BWA index.
Software dependencies¶
- bwa ==0.7.15
Example¶
This wrapper can be used in the following way:
rule bwa_index:
    input:
        "{genome}.fasta"
    output:
        "{genome}.amb",
        "{genome}.ann",
        "{genome}.bwt",
        "{genome}.pac",
        "{genome}.sa"
    log:
        "logs/bwa_index/{genome}.log"
    params:
        prefix="{genome}",
        algorithm="bwtsw"
    wrapper:
        "0.19.2/bio/bwa/index"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Patrik Smeds
Code¶
__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2016, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"
from os import path

from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Check inputs/arguments.
if len(snakemake.input) == 0:
    raise ValueError("A reference genome has to be provided!")
elif len(snakemake.input) > 1:
    raise ValueError("Only one reference genome can be provided!")

# Prefix that should be used for the database
prefix = snakemake.params.get("prefix", "")
if len(prefix) > 0:
    prefix = "-p " + prefix

# Construction algorithm that will be used to build the database, default is bwtsw
construction_algorithm = snakemake.params.get("algorithm", "")
if len(construction_algorithm) != 0:
    construction_algorithm = "-a " + construction_algorithm

shell(
    "bwa index"
    " {prefix}"
    " {construction_algorithm}"
    " {snakemake.input[0]}"
    " {log}")
BWA MEM¶
Map reads using bwa mem, with optional sorting using samtools or picard.
Software dependencies¶
- bwa ==0.7.15
- samtools ==1.5
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule bwa_mem:
    input:
        ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        "mapped/{sample}.bam"
    log:
        "logs/bwa_mem/{sample}.log"
    params:
        index="genome",
        extra=r"-R '@RG\tID:{sample}\tSM:{sample}'",
        sort="none",             # Can be 'none', 'samtools' or 'picard'.
        sort_order="queryname",  # Can be 'queryname' or 'coordinate'.
        sort_extra=""            # Extra args for samtools/picard.
    threads: 8
    wrapper:
        "0.19.2/bio/bwa/mem"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
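The sort, sort_order and sort_extra parameters in the example above determine which command the aligner output is piped into. The following standalone sketch mirrors that selection logic outside of Snakemake. The function name build_pipe_cmd is hypothetical, and the command strings are modelled on the wrapper code shown in this section rather than on any official API.

```python
from os import path


def build_pipe_cmd(output, sort="none", sort_order="coordinate", sort_extra=""):
    """Return the command that receives SAM records on stdin (illustrative)."""
    if sort_order not in {"coordinate", "queryname"}:
        raise ValueError("Unexpected value for sort_order ({})".format(sort_order))
    if sort == "none":
        # No sorting: only convert SAM to BAM.
        return "samtools view -Sbh -o {} -".format(output)
    if sort == "samtools":
        if sort_order == "queryname":
            sort_extra += " -n"
        # Temporary files are placed next to the output file.
        sort_extra += " -T " + path.splitext(output)[0] + ".tmp"
        return "samtools sort {} -o {} -".format(sort_extra, output)
    if sort == "picard":
        return ("picard SortSam {} INPUT=/dev/stdin"
                " OUTPUT={} SORT_ORDER={}").format(sort_extra, output, sort_order)
    raise ValueError("Unexpected value for params.sort ({})".format(sort))


print(build_pipe_cmd("mapped/a.bam", sort="samtools", sort_order="queryname"))
```

Note that sort_order only has an effect when sort is 'samtools' or 'picard'; with sort="none" the output keeps the order produced by the aligner.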
Authors¶
- Johannes Köster
- Julian de Ruiter
Code¶
__author__ = "Johannes Köster, Julian de Ruiter"
__copyright__ = "Copyright 2016, Johannes Köster and Julian de Ruiter"
__email__ = "koester@jimmy.harvard.edu, julianderuiter@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Check inputs/arguments.
if len(snakemake.input) not in {1, 2}:
    raise ValueError("input must have 1 (single-end) or "
                     "2 (paired-end) elements")

if sort_order not in {"coordinate", "queryname"}:
    raise ValueError("Unexpected value for sort_order ({})".format(sort_order))

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    # {sort_extra} is only substituted when shell() formats the command,
    # so the changes made to sort_extra below still take effect.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

shell(
    "(bwa mem"
    " -t {snakemake.threads}"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input}"
    " | " + pipe_cmd + ") {log}")
BWA SAMPE¶
Map paired-end reads with bwa sampe.
Software dependencies¶
- bwa ==0.7.15
- samtools ==1.3
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule bwa_sampe:
    input:
        fastq=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"],
        sai=["sai/{sample}.1.sai", "sai/{sample}.2.sai"]
    output:
        "mapped/{sample}.bam"
    params:
        index="genome",
        extra=r"-r '@RG\tID:{sample}\tSM:{sample}'",  # optional: Extra parameters for bwa.
        sort="none",             # optional: Enable sorting. Possible values: 'none', 'samtools' or 'picard'
        sort_order="queryname",  # optional: Sort by 'queryname' or 'coordinate'
        sort_extra=""            # optional: extra arguments for samtools/picard
    log:
        "logs/bwa_sampe/{sample}.log"
    wrapper:
        "0.19.2/bio/bwa/sampe"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for bwa sampe."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
# Check inputs.
if not len(snakemake.input.sai) == 2:
    raise ValueError('input.sai must have 2 elements')

if not len(snakemake.input.fastq) == 2:
    raise ValueError('input.fastq must have 2 elements')

# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    # Use prefix for temp.
    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

# Run command.
shell(
    "(bwa sampe"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input.sai}"
    " {snakemake.input.fastq}"
    " | " + pipe_cmd + ") {log}")
BWA SAMSE¶
Map single-end reads with bwa samse.
Software dependencies¶
- bwa ==0.7.15
- samtools ==1.3
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule bwa_samse:
    input:
        fastq="reads/{sample}.1.fastq",
        sai="sai/{sample}.1.sai"
    output:
        "mapped/{sample}.bam"
    params:
        index="genome",
        extra=r"-r '@RG\tID:{sample}\tSM:{sample}'",  # optional: Extra parameters for bwa.
        sort="none",             # optional: Enable sorting. Possible values: 'none', 'samtools' or 'picard'
        sort_order="queryname",  # optional: Sort by 'queryname' or 'coordinate'
        sort_extra=""            # optional: extra arguments for samtools/picard
    log:
        "logs/bwa_samse/{sample}.log"
    wrapper:
        "0.19.2/bio/bwa/samse"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for bwa samse."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":

    # Simply convert to bam using samtools view.
    pipe_cmd = "samtools view -Sbh -o {snakemake.output[0]} -"

elif sort == "samtools":

    # Sort alignments using samtools sort.
    pipe_cmd = "samtools sort {sort_extra} -o {snakemake.output[0]} -"

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    # Use prefix for temp.
    prefix = path.splitext(snakemake.output[0])[0]
    sort_extra += " -T " + prefix + ".tmp"

elif sort == "picard":

    # Sort alignments using picard SortSam.
    pipe_cmd = ("picard SortSam {sort_extra} INPUT=/dev/stdin"
                " OUTPUT={snakemake.output[0]} SORT_ORDER={sort_order}")

else:
    raise ValueError("Unexpected value for params.sort ({})".format(sort))

# Run command.
shell(
    "(bwa samse"
    " {extra}"
    " {snakemake.params.index}"
    " {snakemake.input.sai}"
    " {snakemake.input.fastq}"
    " | " + pipe_cmd + ") {log}")
CUTADAPT¶
Wrappers¶
CUTADAPT-PE¶
Trim paired-end reads using cutadapt.
Software dependencies¶
- cutadapt ==1.13
Example¶
This wrapper can be used in the following way:
rule cutadapt:
    input:
        ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
    output:
        fastq1="trimmed/{sample}.1.fastq",
        fastq2="trimmed/{sample}.2.fastq",
        qc="trimmed/{sample}.qc.txt"
    params:
        "-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -q 20"
    log:
        "logs/cutadapt/{sample}.log"
    wrapper:
        "0.19.2/bio/cutadapt/pe"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for trimming paired-end reads using cutadapt."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
n = len(snakemake.input)
assert n == 2, "Input must contain 2 (paired-end) elements."
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
"cutadapt"
" {snakemake.params}"
" -o {snakemake.output.fastq1}"
" -p {snakemake.output.fastq2}"
" {snakemake.input}"
" > {snakemake.output.qc} {log}")
CUTADAPT-SE¶
Trim single-end reads using cutadapt.
Software dependencies¶
- cutadapt ==1.13
Example¶
This wrapper can be used in the following way:
rule cutadapt:
    input:
        "reads/{sample}.fastq"
    output:
        fastq="trimmed/{sample}.fastq",
        qc="trimmed/{sample}.qc.txt"
    params:
        "-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -q 20"
    log:
        "logs/cutadapt/{sample}.log"
    wrapper:
        "0.19.2/bio/cutadapt/se"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for trimming single-end reads using cutadapt."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
"cutadapt"
" {snakemake.params}"
" -o {snakemake.output.fastq}"
" {snakemake.input[0]}"
" > {snakemake.output.qc} {log}")
DELLY¶
Call variants with delly.
Software dependencies¶
- delly ==0.7.7
Example¶
This wrapper can be used in the following way:
rule delly:
    input:
        ref="genome.fasta",
        samples=["mapped/a.bam"],
        # optional exclude template (see https://github.com/dellytools/delly)
        exclude="human.hg19.excl.tsv"
    output:
        "sv/{type,(DEL|DUP|INV|TRA|INS)}.bcf"
    params:
        vartype="{type}",  # variant type to call (can be wildcard, hardcoded string or function)
        extra=""  # optional parameters for delly (except -t, -g)
    log:
        "logs/delly/{type}.log"
    threads: 3
    wrapper:
        "0.19.2/bio/delly"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
# The exclude template is optional; omit -x if it was not given as input.
try:
    exclude = "-x " + snakemake.input.exclude
except AttributeError:
    exclude = ""

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

shell(
    "OMP_NUM_THREADS={snakemake.threads} delly call {extra} "
    "{exclude} -t {snakemake.params.vartype} -g {snakemake.input.ref} "
    "-o {snakemake.output[0]} {snakemake.input.samples} {log}")
EPIC¶
Wrappers¶
EPIC¶
Find enriched domains in ChIP-Seq data with epic
Software dependencies¶
- epic>=0.2.5
Example¶
This wrapper can be used in the following way:
rule epic:
    input:
        treatment = "bed/test.bed",
        background = "bed/control.bed"
    output:
        enriched_regions = "epic/enriched_regions.csv",  # required
        bed = "epic/enriched_regions.bed",  # optional
        matrix = "epic/matrix.gz"  # optional
    log:
        "logs/epic/epic.log"
    params:
        genome = "hg19",  # optional, default hg19
        extra="-g 3 -w 200"  # "--bigwig epic/bigwigs"
    threads: 1  # optional, defaults to 1
    wrapper:
        "0.19.2/bio/epic/peaks"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes¶
- All/any of the different bigwig options must be given as extra parameters
Authors¶
- Endre Bakken Stovner
Code¶
__author__ = "Endre Bakken Stovner"
__copyright__ = "Copyright 2017, Endre Bakken Stovner"
__email__ = "endrebak85@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
threads = snakemake.threads or 1
# Fall back to hg19, the default stated in the example rule.
genome = snakemake.params.get("genome", "hg19")

treatment = snakemake.input.get("treatment")
background = snakemake.input.get("background")

enriched_regions = snakemake.output.get("enriched_regions")
bed = snakemake.output.get("bed")
matrix = snakemake.output.get("matrix")

# Use the first log file if one was given; otherwise skip the -l option.
log = snakemake.log[0] if len(snakemake.log) > 0 else None

# Executed shell command
cmd = "epic -cpu {threads} -t {treatment} -c {background} -o {enriched_regions} -gn {genome}"

if bed:
    cmd += " -b {bed}"
if matrix:
    cmd += " -sm {matrix}"
if log:
    cmd += " -l {log}"

cmd += " {extra}"

shell(cmd)
FASTQ_SCREEN¶
fastq_screen screens a library of sequences in FASTQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
This wrapper allows the configuration to be passed as a filename or as a dictionary in the rule’s params.fastq_screen_config. So the following configuration file:
DATABASE ecoli /data/Escherichia_coli/Bowtie2Index/genome BOWTIE2
DATABASE ecoli /data/Escherichia_coli/BowtieIndex/genome BOWTIE
DATABASE hg19 /data/hg19/Bowtie2Index/genome BOWTIE2
DATABASE mm10 /data/mm10/Bowtie2Index/genome BOWTIE2
BOWTIE /path/to/bowtie
BOWTIE2 /path/to/bowtie2
becomes:
fastq_screen_config = {
'database': {
'ecoli': {
'bowtie2': '/data/Escherichia_coli/Bowtie2Index/genome',
'bowtie': '/data/Escherichia_coli/BowtieIndex/genome'},
'hg19': {
'bowtie2': '/data/hg19/Bowtie2Index/genome'},
'mm10': {
'bowtie2': '/data/mm10/Bowtie2Index/genome'}
},
'aligner_paths': {'bowtie': 'bowtie', 'bowtie2': 'bowtie2'}
}
By default, the wrapper will use bowtie2 as the aligner and a subset of 100000 reads. These can be overridden using params.aligner and params.subset respectively. Furthermore, params.extra can be used to pass additional arguments verbatim to fastq_screen, for example extra="--illumina1_3" or extra="--bowtie2 '--trim5=8'".
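The correspondence between the dictionary form and the config file described above can be sketched as a small standalone function. The name render_fastq_screen_config is hypothetical, and sorting the keys is an addition made here so that the output is deterministic; the wrapper itself iterates over the dictionaries in their native order.

```python
def render_fastq_screen_config(config):
    """Render the dictionary form as tab-separated config file text (sketch)."""
    lines = []
    # One DATABASE line per (label, aligner) pair.
    for label in sorted(config["database"]):
        indexes = config["database"][label]
        for aligner_name in sorted(indexes):
            lines.append("\t".join(
                ["DATABASE", label, indexes[aligner_name], aligner_name.upper()]))
    # One line per aligner executable.
    for aligner_name in sorted(config["aligner_paths"]):
        lines.append("\t".join(
            [aligner_name.upper(), config["aligner_paths"][aligner_name]]))
    return "\n".join(lines) + "\n"


example = {
    "database": {"hg19": {"bowtie2": "/data/hg19/Bowtie2Index/genome"}},
    "aligner_paths": {"bowtie2": "bowtie2"},
}
print(render_fastq_screen_config(example), end="")
```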
Software dependencies¶
- fastq-screen ==0.5.2
- bowtie2 ==2.2.6
- bowtie ==1.1.2
Example¶
This wrapper can be used in the following way:
rule fastq_screen:
    input:
        "samples/{sample}.fastq.gz"
    output:
        txt="qc/{sample}.fastq_screen.txt",
        png="qc/{sample}.fastq_screen.png"
    params:
        fastq_screen_config=fastq_screen_config,
        subset=100000,
        aligner='bowtie2'
    threads: 8
    wrapper:
        "0.19.2/bio/fastq_screen"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes¶
- fastq_screen hard-codes the output filenames. This wrapper moves the hard-coded output files to those specified by the rule.
- While the dictionary form of fastq_screen_config is convenient, the unordered nature of the dictionary may cause snakemake --list-params-changed to incorrectly report changed parameters even though the contents remain the same. If you plan on using --list-params-changed, it is better to write a config file and pass that as fastq_screen_config. This problem will disappear with Python 3.6.
- When providing the dictionary form of fastq_screen_config, the wrapper will write a temp file using Python’s tempfile module. To control the temp file directory, make sure the $TMPDIR environment variable is set (see the tempfile docs for details). One way of doing this is by adding something like shell.prefix("export TMPDIR=/scratch; ") to the Snakefile calling this wrapper.
Authors¶
- Ryan Dale
Code¶
import os
from snakemake.shell import shell
import tempfile
__author__ = "Ryan Dale"
__copyright__ = "Copyright 2016, Ryan Dale"
__email__ = "dalerr@niddk.nih.gov"
__license__ = "MIT"
_config = snakemake.params['fastq_screen_config']

subset = snakemake.params.get('subset', 100000)
aligner = snakemake.params.get('aligner', 'bowtie2')
extra = snakemake.params.get('extra', '')
log = snakemake.log_fmt_shell()

# snakemake.params.fastq_screen_config can be either a dict or a string. If
# string, interpret as a filename pointing to the fastq_screen config file.
# Otherwise, create a new tempfile out of the contents of the dict:
if isinstance(_config, dict):
    tmp = tempfile.NamedTemporaryFile(delete=False).name
    with open(tmp, 'w') as fout:
        # Loop variables are named differently from `aligner` above so that
        # the aligner chosen via params is not overwritten.
        for label, indexes in _config['database'].items():
            for db_aligner, index in indexes.items():
                fout.write('\t'.join([
                    'DATABASE', label, index, db_aligner.upper()]) + '\n')
        for aligner_name, aligner_path in _config['aligner_paths'].items():
            fout.write('\t'.join([aligner_name.upper(), aligner_path]) + '\n')
    config_file = tmp
else:
    config_file = _config

# fastq_screen hard-codes filenames according to this prefix. We will send
# hard-coded output to a temp dir, and then move them later.
prefix = os.path.basename(snakemake.input[0].split('.fastq')[0])
tempdir = tempfile.mkdtemp()

shell(
    "fastq_screen --outdir {tempdir} "
    "--force "
    "--aligner {aligner} "
    "--conf {config_file} "
    "--subset {subset} "
    "--threads {snakemake.threads} "
    "{extra} "
    "{snakemake.input[0]} "
    "{log}"
)

# Move output to the filenames specified by the rule
shell("mv {tempdir}/{prefix}_screen.txt {snakemake.output.txt}")
shell("mv {tempdir}/{prefix}_screen.png {snakemake.output.png}")

# Clean up temp
shell("rm -r {tempdir}")
if isinstance(_config, dict):
    shell("rm {tmp}")
FASTQC¶
Generate fastq qc statistics using fastqc.
Software dependencies¶
- fontconfig ==2.12.1
- fastqc ==0.11.5
Example¶
This wrapper can be used in the following way:
rule fastqc:
    input:
        "reads/{sample}.fastq"
    output:
        html="qc/{sample}.html",
        zip="qc/{sample}.zip"
    params: ""
    wrapper:
        "0.19.2/bio/fastqc"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for fastqc."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from os import path

from snakemake.shell import shell


def basename_without_ext(file_path):
    """Returns basename of file path, without the file extension."""

    base = path.basename(file_path)

    split_ind = 2 if base.endswith(".gz") else 1
    base = ".".join(base.split(".")[:-split_ind])

    return base


# Run fastqc.
output_dir = path.dirname(snakemake.output.html)

shell("fastqc {snakemake.params} --quiet "
      "--outdir {output_dir} {snakemake.input[0]}")

# Move outputs into proper position.
output_base = basename_without_ext(snakemake.input[0])
html_path = path.join(output_dir, output_base + "_fastqc.html")
zip_path = path.join(output_dir, output_base + "_fastqc.zip")

if snakemake.output.html != html_path:
    shell("mv {html_path} {snakemake.output.html}")

if snakemake.output.zip != zip_path:
    shell("mv {zip_path} {snakemake.output.zip}")
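To make the renaming step above concrete, the following sketch computes the file names that the wrapper code expects fastqc to write for a given input, before they are moved to the paths requested by the rule. The function name fastqc_default_outputs is hypothetical; the naming scheme is copied from the wrapper code, and posixpath is used here only to keep the separators the same on every platform.

```python
import posixpath


def fastqc_default_outputs(input_path, output_dir):
    """Return the (html, zip) paths derived from the input file name (sketch)."""
    base = posixpath.basename(input_path)
    # Drop ".fastq.gz"-style double extensions as well as single extensions.
    split_ind = 2 if base.endswith(".gz") else 1
    base = ".".join(base.split(".")[:-split_ind])
    return (posixpath.join(output_dir, base + "_fastqc.html"),
            posixpath.join(output_dir, base + "_fastqc.zip"))


print(fastqc_default_outputs("reads/a.fastq.gz", "qc"))
```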
FREEBAYES¶
Call small genomic variants with freebayes.
Software dependencies¶
- freebayes ==1.1.0
- bcftools ==1.5
- parallel ==20170422
Example¶
This wrapper can be used in the following way:
rule freebayes:
    input:
        ref="genome.fasta",
        # you can have a list of samples here
        samples="mapped/{sample}.bam"
    output:
        "calls/{sample}.vcf"  # either .vcf or .bcf
    log:
        "logs/freebayes/{sample}.log"
    params:
        extra="",         # optional parameters
        chunksize=100000  # reference genome chunk size for parallelization (default: 100000)
    threads: 2
    wrapper:
        "0.19.2/bio/freebayes"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2017, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"
from snakemake.shell import shell
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

params = snakemake.params.get("extra", "")

# Convert to BCF if the requested output file ends with .bcf.
pipe = ""
if snakemake.output[0].endswith(".bcf"):
    pipe = "| bcftools view -Ob -"

# With more than one thread, run freebayes in parallel on genome chunks.
if snakemake.threads == 1:
    freebayes = "freebayes"
else:
    chunksize = snakemake.params.get("chunksize", 100000)
    freebayes = ("freebayes-parallel <(fasta_generate_regions.py "
                 "{snakemake.input.ref}.fai {chunksize}) "
                 "{snakemake.threads}").format(snakemake=snakemake,
                                               chunksize=chunksize)

shell("({freebayes} {params} -f {snakemake.input.ref}"
      " {snakemake.input.samples} {pipe} > {snakemake.output[0]}) {log}")
HISAT2¶
Map reads with hisat2.
Software dependencies¶
- hisat2 ==2.1.0
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule hisat2:
    input:
        reads=["reads/{sample}.1.fastq.gz", "reads/{sample}.2.fastq.gz"],
    output:
        "mapped/{sample}.bam"
    log:  # optional
        "logs/hisat2/{sample}.log"
    params:  # idx is required, extra is optional
        idx="genome.fa",
        extra="--min-intronlen 1000"
    threads: 8  # optional, defaults to 1
    wrapper:
        "0.19.2/bio/hisat2"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes¶
- The -S flag must not be used since output is already directly piped to samtools for compression.
- The --threads/-p flag must not be used since threads is set separately via the snakemake threads directive.
- The wrapper does not yet handle SRA input accessions.
- No reference index files checking is done since the actual number of files may differ depending on the reference sequence size. This is also why the index is supplied in the params directive instead of the input directive.
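The reads input of this wrapper may be a single file name or a list with one or two entries. As a minimal sketch of how these cases map to hisat2 input flags, the following standalone function mirrors the branching in the wrapper code of this section; the name hisat2_input_flags is hypothetical.

```python
def hisat2_input_flags(reads):
    """Return the hisat2 read arguments for single-end or paired-end input."""
    if isinstance(reads, str):
        # A bare string is treated as one single-end file.
        return "-U {0}".format(reads)
    if len(reads) == 1:
        return "-U {0}".format(reads[0])
    if len(reads) == 2:
        return "-1 {0} -2 {1}".format(*reads)
    raise RuntimeError(
        "Reads parameter must contain at least 1 and at most 2 input files.")


print(hisat2_input_flags(["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"]))
```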
Authors¶
- Wibowo Arindrarto
Code¶
__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"
from snakemake.shell import shell
# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
# Run log
log = snakemake.log_fmt_shell()

# Input file wrangling
reads = snakemake.input.get("reads")
if isinstance(reads, str):
    input_flags = "-U {0}".format(reads)
elif len(reads) == 1:
    input_flags = "-U {0}".format(reads[0])
elif len(reads) == 2:
    input_flags = "-1 {0} -2 {1}".format(*reads)
else:
    raise RuntimeError(
        "Reads parameter must contain at least 1 and at most 2"
        " input files.")

# Executed shell command
shell(
    "(hisat2 {extra} --threads {snakemake.threads}"
    " -x {snakemake.params.idx} {input_flags}"
    " | samtools view -Sbh -o {snakemake.output[0]} -)"
    " {log}")
MULTIQC¶
Generate qc report using multiqc.
Software dependencies¶
- multiqc ==1.2
- networkx <2.0
Example¶
This wrapper can be used in the following way:
rule multiqc:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"])
    output:
        "qc/multiqc.html"
    params:
        ""  # Optional: extra parameters for multiqc.
    log:
        "logs/multiqc.log"
    wrapper:
        "0.19.2/bio/multiqc"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
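The wrapper does not pass the individual input files to multiqc. As the code listed under Code shows, it passes the set of their parent directories and splits the output path into a directory (-o) and a file name (-n). A minimal sketch of that path handling, using the hypothetical helper names multiqc_input_dirs and split_output:

```python
from os import path


def multiqc_input_dirs(input_files):
    """Collect the unique parent directories of the input files.

    Hypothetical helper; mirrors the set comprehension in the wrapper.
    """
    return set(path.dirname(fp) for fp in input_files)


def split_output(output_file):
    """Split the report path into the values used for -o and -n."""
    return path.dirname(output_file), path.basename(output_file)


files = ["samtools_stats/a.txt", "samtools_stats/b.txt", "fastqc/a.zip"]
print(sorted(multiqc_input_dirs(files)))
print(split_output("qc/multiqc.html"))
```

Because whole directories are scanned, multiqc may also pick up other recognised files that live next to the declared inputs.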
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for multiqc."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
input_dirs = set(path.dirname(fp) for fp in snakemake.input)
output_dir = path.dirname(snakemake.output[0])
output_name = path.basename(snakemake.output[0])
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
shell(
"multiqc"
" {snakemake.params}"
" --force"
" -o {output_dir}"
" -n {output_name}"
" {input_dirs}"
" {log}")
NGS-DISAMBIGUATE¶
Disambiguation algorithm for reads aligned to two species (e.g. human and mouse genomes) from Tophat, Hisat2, STAR or BWA mem.
Software dependencies¶
- ngs-disambiguate ==2016.11.10
Example¶
This wrapper can be used in the following way:
rule disambiguate:
input:
a="mapped/{sample}.a.bam",
b="mapped/{sample}.b.bam"
output:
a_ambiguous='disambiguate/{sample}.graft.ambiguous.bam',
b_ambiguous='disambiguate/{sample}.host.ambiguous.bam',
a_disambiguated='disambiguate/{sample}.graft.bam',
b_disambiguated='disambiguate/{sample}.host.bam',
summary='qc/disambiguate/{sample}.txt'
params:
algorithm="bwa",
# optional: Prefix to use for output. If omitted, a
# suitable value is guessed from the output paths. Prefix
# is used for the intermediate output paths, as well as
# sample name in summary file.
prefix="{sample}",
extra=""
wrapper:
"0.19.2/bio/ngs-disambiguate"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for ngs-disambiguate (from Astrazeneca)."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
# Extract arguments.
prefix = snakemake.params.get("prefix", None)
extra = snakemake.params.get("extra", "")
output_dir = path.dirname(snakemake.output.a_ambiguous)
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
# If prefix is not given, we use the summary path to derive the most
# probable sample name (as the summary path is least likely to contain
# additional suffixes). This is better than using a random id as prefix,
# since the prefix is also used as the sample name in the summary file.
if prefix is None:
    prefix = path.splitext(path.basename(snakemake.output.summary))[0]
# Run command.
shell(
"ngs_disambiguate"
" {extra}"
" -o {output_dir}"
" -s {prefix}"
" -a {snakemake.params.algorithm}"
" {snakemake.input.a}"
" {snakemake.input.b}")
# Move outputs into expected positions.
output_base = path.join(output_dir, prefix)
output_map = {
output_base + ".ambiguousSpeciesA.bam":
snakemake.output.a_ambiguous,
output_base + ".ambiguousSpeciesB.bam":
snakemake.output.b_ambiguous,
output_base + ".disambiguatedSpeciesA.bam":
snakemake.output.a_disambiguated,
output_base + ".disambiguatedSpeciesB.bam":
snakemake.output.b_disambiguated,
output_base + "_summary.txt":
snakemake.output.summary
}
for src, dest in output_map.items():
    if src != dest:
        shell('mv {src} {dest}')
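The prefix guessing and the renaming step can be illustrated without running ngs_disambiguate. The sketch below is an assumption-labelled illustration, not part of the wrapper: guess_prefix and intermediate_paths are hypothetical names, and the file suffixes are copied from the wrapper code above.

```python
from os import path


def guess_prefix(summary_path):
    """Derive a sample name from the summary path by dropping directory and extension."""
    return path.splitext(path.basename(summary_path))[0]


def intermediate_paths(output_dir, prefix):
    """Return the paths the wrapper expects ngs_disambiguate to write.

    Hypothetical helper; keys match the named outputs of the example rule.
    """
    base = path.join(output_dir, prefix)
    return {
        "a_ambiguous": base + ".ambiguousSpeciesA.bam",
        "b_ambiguous": base + ".ambiguousSpeciesB.bam",
        "a_disambiguated": base + ".disambiguatedSpeciesA.bam",
        "b_disambiguated": base + ".disambiguatedSpeciesB.bam",
        "summary": base + "_summary.txt",
    }


prefix = guess_prefix("qc/disambiguate/sample1.txt")
for name, src in intermediate_paths("disambiguate", prefix).items():
    print(name, src)
```

Each intermediate path that differs from the corresponding declared output is then moved to that output path.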
PICARD¶
Wrappers¶
PICARD ADDORREPLACEREADGROUPS¶
Add or replace read groups with picard tools.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule replace_rg:
input:
"mapped/{sample}.bam"
output:
"fixed-rg/{sample}.bam"
log:
"logs/picard/replace_rg/{sample}.log"
params:
"RGLB=lib1 RGPL=illumina RGPU={sample} RGSM={sample}"
wrapper:
"0.19.2/bio/picard/addorreplacereadgroups"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("picard AddOrReplaceReadGroups {snakemake.params} I={snakemake.input} "
"O={snakemake.output} &> {snakemake.log}")
PICARD COLLECTALIGNMENTSUMMARYMETRICS¶
Collect metrics on aligned reads with picard tools.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule alignment_summary:
input:
ref="genome.fasta",
bam="mapped/{sample}.bam"
output:
"stats/{sample}.summary.txt"
log:
"logs/picard/alignment-summary/{sample}.log"
params:
# optional parameters (e.g. relax checks as below)
"VALIDATION_STRINGENCY=LENIENT "
"METRIC_ACCUMULATION_LEVEL=null "
"METRIC_ACCUMULATION_LEVEL=SAMPLE"
wrapper:
"0.19.2/bio/picard/collectalignmentsummarymetrics"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"
from snakemake.shell import shell
log = snakemake.log_fmt_shell()
shell("picard CollectAlignmentSummaryMetrics {snakemake.params} "
"INPUT={snakemake.input.bam} OUTPUT={snakemake.output[0]} "
"REFERENCE_SEQUENCE={snakemake.input.ref} {log}")
PICARD COLLECTHSMETRICS¶
Collects hybrid-selection (HS) metrics for a SAM or BAM file using picard.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule picard_collect_hs_metrics:
input:
bam="mapped/{sample}.bam",
reference="genome.fasta",
# Baits and targets should be given as interval lists. These can
# be generated from bed files using picard BedToIntervalList.
bait_intervals="regions.intervals",
target_intervals="regions.intervals"
output:
"stats/hs_metrics/{sample}.txt"
params:
# Optional extra arguments, passed via the extra key. Here we reduce
# sample size to reduce the runtime in our unit test.
extra="SAMPLE_SIZE=1000"
log:
"logs/picard_collect_hs_metrics/{sample}.log"
wrapper:
"0.19.2/bio/picard/collecthsmetrics"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for picard CollectHSMetrics."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
"picard CollectHsMetrics"
" {extra}"
" INPUT={snakemake.input.bam}"
" OUTPUT={snakemake.output[0]}"
" REFERENCE_SEQUENCE={snakemake.input.reference}"
" BAIT_INTERVALS={snakemake.input.bait_intervals}"
" TARGET_INTERVALS={snakemake.input.target_intervals}"
" {log}")
PICARD COLLECTINSERTSIZEMETRICS¶
Collect metrics on insert size of paired end reads with picard tools.
Software dependencies¶
- picard ==2.9.2
- r-base ==3.3.2
Example¶
This wrapper can be used in the following way:
rule insert_size:
input:
"mapped/{sample}.bam"
output:
txt="stats/{sample}.isize.txt",
pdf="stats/{sample}.isize.pdf"
log:
"logs/picard/insert_size/{sample}.log"
params:
# optional parameters (e.g. relax checks as below)
"VALIDATION_STRINGENCY=LENIENT "
"METRIC_ACCUMULATION_LEVEL=null "
"METRIC_ACCUMULATION_LEVEL=SAMPLE"
wrapper:
"0.19.2/bio/picard/collectinsertsizemetrics"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "johannes.koester@protonmail.com"
__license__ = "MIT"
from snakemake.shell import shell
log = snakemake.log_fmt_shell()
shell("picard CollectInsertSizeMetrics {snakemake.params} "
"INPUT={snakemake.input} OUTPUT={snakemake.output.txt} "
"HISTOGRAM_FILE={snakemake.output.pdf} {log}")
PICARD MARKDUPLICATES¶
Mark PCR and optical duplicates with picard tools.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule mark_duplicates:
input:
"mapped/{sample}.bam"
output:
bam="dedup/{sample}.bam",
metrics="dedup/{sample}.metrics.txt"
log:
"logs/picard/dedup/{sample}.log"
params:
"REMOVE_DUPLICATES=true"
wrapper:
"0.19.2/bio/picard/markduplicates"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("picard MarkDuplicates {snakemake.params} INPUT={snakemake.input} "
"OUTPUT={snakemake.output.bam} METRICS_FILE={snakemake.output.metrics} "
"&> {snakemake.log}")
PICARD MERGESAMFILES¶
Merge sam/bam files using picard tools.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule merge_bams:
input:
expand("mapped/{sample}.bam", sample=["a", "b"])
output:
"merged.bam"
log:
"logs/picard_mergesamfiles.log"
params:
"VALIDATION_STRINGENCY=LENIENT"
wrapper:
"0.19.2/bio/picard/mergesamfiles"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
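Picard expects one INPUT= argument per file to merge. The wrapper code listed under Code builds these from the rule's input list; a minimal sketch of that step, with the hypothetical helper name picard_inputs:

```python
def picard_inputs(input_files):
    """Format each input file as a separate INPUT=<file> argument."""
    return " ".join("INPUT={}".format(in_) for in_ in input_files)


print(picard_inputs(["mapped/a.bam", "mapped/b.bam"]))
# prints: INPUT=mapped/a.bam INPUT=mapped/b.bam
```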
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for picard MergeSamFiles."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
inputs = " ".join("INPUT={}".format(in_) for in_ in snakemake.input)
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
"picard"
" MergeSamFiles"
" {snakemake.params}"
" {inputs}"
" OUTPUT={snakemake.output[0]}"
" {log}")
PICARD SORTSAM¶
Sort sam/bam files using picard tools.
Software dependencies¶
- picard ==2.9.2
Example¶
This wrapper can be used in the following way:
rule sort_bam:
input:
"mapped/{sample}.bam"
output:
"sorted/{sample}.bam"
log:
"logs/picard/sort_sam/{sample}.log"
params:
sort_order="coordinate",
extra="VALIDATION_STRINGENCY=LENIENT" # optional: Extra arguments for picard.
wrapper:
"0.19.2/bio/picard/sortsam"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for picard SortSam."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell(
'picard'
' SortSam'
' {extra}'
' INPUT={snakemake.input[0]}'
' OUTPUT={snakemake.output[0]}'
' SORT_ORDER={snakemake.params.sort_order}'
' {log}')
PINDEL¶
Wrappers¶
PINDEL¶
Call variants with pindel.
Software dependencies¶
- pindel ==0.2.5b8
Example¶
This wrapper can be used in the following way:
pindel_types = ["D", "BP", "INV", "TD", "LI", "SI", "RP"]
rule pindel:
input:
ref="genome.fasta",
# samples to call
samples=["mapped/a.bam"],
# bam configuration file, see http://gmt.genome.wustl.edu/packages/pindel/quick-start.html
config="pindel_config.txt"
output:
expand("pindel/all_{type}", type=pindel_types)
params:
# prefix must be consistent with output files
prefix="pindel/all",
extra="" # optional parameters (except -i, -f, -o)
log:
"logs/pindel.log"
threads: 4
wrapper:
"0.19.2/bio/pindel/call"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
import os
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
shell("pindel -T {snakemake.threads} {extra} -i {snakemake.input.config} "
"-f {snakemake.input.ref} -o {snakemake.params.prefix} {log}")
PINDEL2VCF¶
Convert pindel output to vcf.
Software dependencies¶
- pindel ==0.2.5b8
Example¶
This wrapper can be used in the following way:
rule pindel2vcf:
input:
ref="genome.fasta",
pindel="pindel/all_{type}"
output:
"pindel/all_{type}.vcf"
params:
refname="hg38", # mandatory, see pindel manual
refdate="20170110", # mandatory, see pindel manual
extra="" # extra params (except -r, -p, -R, -d, -v)
log:
"logs/pindel/pindel2vcf.{type}.log"
wrapper:
"0.19.2/bio/pindel/pindel2vcf"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
import os
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
shell("pindel2vcf {extra} -p {snakemake.input.pindel} "
      "-r {snakemake.input.ref} -R {snakemake.params.refname} "
      "-d {snakemake.params.refdate} -v {snakemake.output[0]} {log}")
SAMBAMBA¶
Wrappers¶
SAMBAMBA SORT¶
Sort bam file with sambamba.
Software dependencies¶
- sambamba ==0.6.6
Example¶
This wrapper can be used in the following way:
rule sambamba_sort:
input:
"mapped/{sample}.bam"
output:
"mapped/{sample}.sorted.bam"
params:
"" # optional parameters
threads: 8
wrapper:
"0.19.2/bio/sambamba/sort"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
import os
from snakemake.shell import shell
shell(
"sambamba sort {snakemake.params} -t {snakemake.threads} "
"-o {snakemake.output[0]} {snakemake.input[0]}")
SAMTOOLS¶
Wrappers¶
SAMTOOLS FLAGSTAT¶
Use samtools to create a flagstat file from a bam or sam file.
Software dependencies¶
- samtools ==1.3
Example¶
This wrapper can be used in the following way:
rule samtools_flagstat:
input: "mapped/{sample}.bam"
output: "mapped/{sample}.bam.flagstat"
wrapper:
"0.19.2/bio/samtools/flagstat"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Christopher Preusch
Code¶
__author__ = "Christopher Preusch"
__copyright__ = "Copyright 2017, Christopher Preusch"
__email__ = "cpreusch[at]ust.hk"
__license__ = "MIT"
from snakemake.shell import shell
shell("samtools flagstat {snakemake.input[0]} > {snakemake.output[0]}")
SAMTOOLS INDEX¶
Index bam file with samtools.
Software dependencies¶
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule samtools_index:
input: "mapped/{sample}.sorted.bam"
output: "mapped/{sample}.sorted.bam.bai"
params:
"" # optional params string
wrapper:
"0.19.2/bio/samtools/index"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("samtools index {snakemake.params} {snakemake.input[0]} {snakemake.output[0]}")
SAMTOOLS MERGE¶
Merge bam files with samtools.
Software dependencies¶
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule samtools_merge:
input:
["mapped/A.bam", "mapped/B.bam"]
output:
"merged.bam"
params:
"" # optional additional parameters as string
threads: 8
wrapper:
"0.19.2/bio/samtools/merge"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("samtools merge --threads {snakemake.threads} {snakemake.params} "
"{snakemake.output[0]} {snakemake.input}")
SAMTOOLS SORT¶
Sort bam file with samtools.
Software dependencies¶
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule samtools_sort:
input:
"mapped/{sample}.bam"
output:
"mapped/{sample}.sorted.bam"
params:
"-m 4G"
threads: 8
wrapper:
"0.19.2/bio/samtools/sort"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
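The wrapper derives the prefix for temporary files (the -T option of samtools sort) from the output path by stripping its last extension, as the code listed under Code shows. A minimal sketch, with the hypothetical helper name sort_tmp_prefix:

```python
import os


def sort_tmp_prefix(output_bam):
    """Strip the final extension from the output path to get the -T prefix."""
    return os.path.splitext(output_bam)[0]


print(sort_tmp_prefix("mapped/a.sorted.bam"))
# prints: mapped/a.sorted
```

Temporary files therefore end up next to the output file, so that directory needs enough free space for the sort.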
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
import os
from snakemake.shell import shell
prefix = os.path.splitext(snakemake.output[0])[0]
shell(
"samtools sort {snakemake.params} -@ {snakemake.threads} -o {snakemake.output[0]} "
"-T {prefix} {snakemake.input[0]}")
SAMTOOLS STATS¶
Generate stats using samtools.
Software dependencies¶
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule samtools_stats:
input:
"mapped/{sample}.bam"
output:
"samtools_stats/{sample}.txt"
params:
extra="", # Optional: extra arguments.
region="1:1000000-2000000" # Optional: region string.
log:
"logs/samtools_stats/{sample}.log"
wrapper:
"0.19.2/bio/samtools/stats"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Julian de Ruiter
Code¶
"""Snakemake wrapper for samtools stats."""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
region = snakemake.params.get("region", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
shell("samtools stats {extra} {snakemake.input}"
" {region} > {snakemake.output} {log}")
SAMTOOLS VIEW¶
Convert or filter SAM/BAM.
Software dependencies¶
- samtools ==1.5
Example¶
This wrapper can be used in the following way:
rule samtools_view:
input:
"{sample}.sam"
output:
"{sample}.bam"
params:
"-b" # optional params string
wrapper:
"0.19.2/bio/samtools/view"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("samtools view {snakemake.params} {snakemake.input[0]} > {snakemake.output[0]}")
SICKLE¶
Wrappers¶
SICKLE PE¶
Trim paired-end reads with sickle.
Software dependencies¶
- sickle-trim ==1.33
Example¶
This wrapper can be used in the following way:
rule sickle_pe:
input:
r1="input_R1.fq",
r2="input_R2.fq"
output:
r1="output_R1.fq",
r2="output_R2.fq",
rs="output_single.fq",
params:
qual_type="sanger",
# optional extra parameters
extra=""
log:
# optional log file
"logs/sickle/job.log"
wrapper:
"0.19.2/bio/sickle/pe"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Wibowo Arindrarto
Code¶
__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"
from snakemake.shell import shell
# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell()
shell(
"(sickle pe -f {snakemake.input.r1} -r {snakemake.input.r2} "
"-o {snakemake.output.r1} -p {snakemake.output.r2} "
"-s {snakemake.output.rs} -t {snakemake.params.qual_type} "
"{extra}) {log}"
)
SICKLE SE¶
Trim single-end reads with sickle.
Software dependencies¶
- sickle-trim ==1.33
Example¶
This wrapper can be used in the following way:
rule sickle_se:
    input:
        "input_R1.fq"
    output:
        "output_R1.fq"
    params:
        qual_type="sanger",
        # optional extra parameters
        extra=""
    log:
        "logs/sickle/job.log"
    wrapper:
        "0.19.2/bio/sickle/se"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Wibowo Arindrarto
Code¶
__author__ = "Wibowo Arindrarto"
__copyright__ = "Copyright 2016, Wibowo Arindrarto"
__email__ = "bow@bow.web.id"
__license__ = "BSD"
from snakemake.shell import shell
# Placeholder for optional parameters
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell()
shell(
"(sickle se -f {snakemake.input[0]} -o {snakemake.output[0]} "
"-t {snakemake.params.qual_type} {extra}) {log}"
)
STAR¶
Wrappers¶
STAR¶
Map reads with STAR.
Software dependencies¶
- star ==2.5.3a
Example¶
This wrapper can be used in the following way:
rule star:
input:
sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
output:
# see STAR manual for additional output files
"star/{sample}/Aligned.out.bam"
log:
"logs/star/{sample}.log"
params:
# path to STAR reference genome index
index="index",
# optional parameters
extra=""
threads: 8
wrapper:
"0.19.2/bio/star/align"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
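Two details of the wrapper code listed under Code are worth spelling out. Only the directory of the first output file is passed to STAR (as --outFileNamePrefix), so the file name itself should match what STAR writes there, which is Aligned.out.bam for unsorted BAM output. Gzipped reads are detected from the first input file only. A minimal sketch of both derivations, with the hypothetical helper name star_read_args:

```python
import os


def star_read_args(sample_files, output_bam):
    """Return (readcmd, outprefix) as derived by the wrapper.

    Hypothetical helper; raises ValueError where the wrapper uses an assert.
    """
    n = len(sample_files)
    if n not in (1, 2):
        raise ValueError(
            "input->sample must have 1 (single-end) or 2 (paired-end) elements.")
    # Only the first file decides whether reads are decompressed with zcat.
    readcmd = "--readFilesCommand zcat" if sample_files[0].endswith(".gz") else ""
    # STAR appends its own file names to this prefix.
    outprefix = os.path.dirname(output_bam) + "/"
    return readcmd, outprefix


print(star_read_args(["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"],
                     "star/a/Aligned.out.bam"))
```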
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
import os
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
n = len(snakemake.input.sample)
assert n == 1 or n == 2, "input->sample must have 1 (single-end) or 2 (paired-end) elements."
if snakemake.input.sample[0].endswith(".gz"):
    readcmd = "--readFilesCommand zcat"
else:
    readcmd = ""
outprefix = os.path.dirname(snakemake.output[0]) + "/"
shell(
"STAR "
"{extra} "
"--runThreadN {snakemake.threads} "
"--genomeDir {snakemake.params.index} "
"--readFilesIn {snakemake.input.sample} "
"{readcmd} "
"--outSAMtype BAM Unsorted "
"--outFileNamePrefix {outprefix} "
"--outStd Log "
"{log}")
TRIMMOMATIC¶
Wrappers¶
TRIMMOMATIC PE¶
Trim paired-end reads with trimmomatic.
Software dependencies¶
- trimmomatic ==0.36
Example¶
This wrapper can be used in the following way:
rule trimmomatic_pe:
input:
r1="reads/{sample}.1.fastq",
r2="reads/{sample}.2.fastq"
output:
r1="trimmed/{sample}.1.fastq.gz",
r2="trimmed/{sample}.2.fastq.gz",
# reads where trimming entirely removed the mate
r1_unpaired="trimmed/{sample}.1.unpaired.fastq.gz",
r2_unpaired="trimmed/{sample}.2.unpaired.fastq.gz"
log:
"logs/trimmomatic/{sample}.log"
params:
# list of trimmers (see manual)
trimmer=["TRAILING:3"],
# optional parameters
extra=""
wrapper:
"0.19.2/bio/trimmomatic/pe"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
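The order of the positional arguments matters for trimmomatic PE: both inputs first, then the paired and unpaired output for each mate, then the trimming steps in the order they should be applied. The sketch below assembles such a command string; trimmomatic_pe_args is a hypothetical helper name, and the file names are placeholders.

```python
def trimmomatic_pe_args(r1, r2, out_r1, out_r1_unpaired, out_r2,
                        out_r2_unpaired, trimmer, extra=""):
    """Assemble a `trimmomatic PE` command string (hypothetical helper)."""
    parts = ["trimmomatic PE"]
    if extra:
        parts.append(extra)
    # Inputs, then paired/unpaired output for mate 1, then for mate 2.
    parts += [r1, r2, out_r1, out_r1_unpaired, out_r2, out_r2_unpaired]
    # Trimming steps are applied in the order given.
    parts.append(" ".join(trimmer))
    return " ".join(parts)


print(trimmomatic_pe_args(
    "a.1.fastq", "a.2.fastq",
    "t.1.fastq.gz", "t.1.unpaired.fastq.gz",
    "t.2.fastq.gz", "t.2.unpaired.fastq.gz",
    trimmer=["LEADING:3", "TRAILING:3"]))
```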
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
trimmer = " ".join(snakemake.params.trimmer)
shell("trimmomatic PE {extra} "
"{snakemake.input.r1} {snakemake.input.r2} "
"{snakemake.output.r1} {snakemake.output.r1_unpaired} "
"{snakemake.output.r2} {snakemake.output.r2_unpaired} "
"{trimmer} "
"{log}")
TRIMMOMATIC SE¶
Trim single-end reads with trimmomatic.
Software dependencies¶
- trimmomatic ==0.36
Example¶
This wrapper can be used in the following way:
rule trimmomatic_se:
input:
"reads/{sample}.fastq"
output:
"trimmed/{sample}.fastq.gz"
log:
"logs/trimmomatic/{sample}.log"
params:
# list of trimmers (see manual)
trimmer=["TRAILING:3"],
# optional parameters
extra=""
wrapper:
"0.19.2/bio/trimmomatic/se"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
trimmer = " ".join(snakemake.params.trimmer)
shell("trimmomatic SE {extra} "
"{snakemake.input} {snakemake.output} "
"{trimmer} "
"{log}")
VCF¶
Wrappers¶
COMPRESS VCF¶
Compress and index vcf file with bgzip and tabix.
Software dependencies¶
- htslib ==1.5
Example¶
This wrapper can be used in the following way:
rule compress_vcf:
input:
"{prefix}.vcf"
output:
"{prefix}.vcf.gz"
wrapper:
"0.19.2/bio/vcf/compress"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("bgzip --stdout {snakemake.input} > {snakemake.output} && tabix -p vcf {snakemake.output}")
UNCOMPRESS VCF¶
Uncompress vcf file with bgzip.
Software dependencies¶
- htslib ==1.5
Example¶
This wrapper can be used in the following way:
rule uncompress_vcf:
input:
"{prefix}.vcf.gz"
output:
"{prefix}.vcf"
wrapper:
"0.19.2/bio/vcf/uncompress"
Note that input, output and log file paths can be chosen freely. When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Authors¶
- Johannes Köster
Code¶
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"
from snakemake.shell import shell
shell("bgzip --decompress --stdout {snakemake.input} > {snakemake.output}")