BOWTIE2_SAMBAMBA
Map reads with Bowtie 2, and post-process alignments with Sambamba:
Step            Tool      Reason
Indexation      Bowtie 2  Create genome sequence index
Mapping         Bowtie 2  Perform read mapping
Sort            Sambamba  Perform sort based on mapping position
Quality filter  Sambamba  Perform mapping quality filter
Deduplication   Sambamba  Identify possible sequencing duplicates
Indexation      Sambamba  Index deduplicated reads
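The per-sample steps above can be sketched as a chain of files. This is only an illustration in plain Python: the file suffixes are taken from the rules in the Code section, while the function name `per_sample_files`, the results directory `results`, and the sample name `sampleA` are hypothetical.

```python
from pathlib import PurePosixPath

# (step, file suffix) pairs, following the outputs of the rules in the Code section
STEPS = [
    ("Mapping", ".bam"),
    ("Sort", ".sorted.bam"),
    ("Quality filter", ".filtered.bam"),
    ("Deduplication", ".rmdup.bam"),
    ("Indexation", ".rmdup.bam.bai"),
]


def per_sample_files(results, sample):
    """Return (step, path) pairs for one sample under <results>/mapped/."""
    mapped = PurePosixPath(results) / "mapped"
    return [(step, str(mapped / f"{sample}{suffix}")) for step, suffix in STEPS]


# Hypothetical results directory and sample name, for illustration only
files = per_sample_files("results", "sampleA")
for step, path in files:
    print(f"{step:<15}{path}")
```

The first three files are declared with temp() in the rules, so Snakemake removes them once the rules that consume them have finished. Only the deduplicated BAM and its index remain.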
Usage
Via module
This usage is recommended with Snakemake >=7.9.
You can include this meta-wrapper in your workflow via the Snakemake module system:
module bowtie2_sambamba:
    meta_wrapper: "v8.0.0/meta/bio/bowtie2_sambamba"
    pathvars:
        results="...",  # Path to results directory
        resources="...",  # Path to resources directory
        logs="...",  # Path to logs directory
        genome_sequence="...",  # Path to FASTA file with genome sequence
        genome_annotation="...",  # Path to GTF file with genome annotation
        reads_r1="...",  # Path/pattern for FASTQ files with R1 reads
        reads_r2="...",  # Path/pattern for FASTQ files with R2 reads

use rule * from bowtie2_sambamba as bowtie2_sambamba_*
Upon using the rules, you can additionally modify input, output, log, and params as needed (see the definition of each rule below and the modules documentation). For additional parameters in each individual wrapper, please refer to their corresponding documentation (see links below).
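As a minimal sketch of such a modification, the fragment below re-declares one imported rule and overrides its params. It assumes the module is named `bowtie2_sambamba` and that the rules were imported with the `bowtie2_sambamba_` prefix. The lowered threshold of 20 is an arbitrary illustrative value, not a recommendation.

```python
# Snakefile fragment (Snakemake syntax, not plain Python).
# Place it after the general `use rule *` statement and keep the same name prefix.
# Directives that are not listed here keep the values defined in the meta-wrapper.
use rule sambamba_view from bowtie2_sambamba as bowtie2_sambamba_sambamba_view with:
    params:
        # Illustrative assumption: relax the mapping quality threshold from 30 to 20
        extra=(
            " --format 'bam' "
            "--filter 'mapping_quality >= 20 and not (unmapped or mate_is_unmapped)' "
        ),
```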
Via copy-paste
Alternatively, you can directly copy-paste and modify the full meta-wrapper code below into your workflow.
Execution
When running with
snakemake --sdm conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Used wrappers
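The following individual wrappers are used in this meta-wrapper:
- bio/bowtie2/build
- bio/bowtie2/align
- bio/sambamba/sort
- bio/sambamba/view
- bio/sambamba/markdup
- bio/sambamba/index
Please refer to each wrapper in the list above for additional configuration parameters and information about the executed code.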
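One example of such a parameter is the filter expression that the sambamba_view rule in the Code section passes through `extra`. The plain-Python sketch below only illustrates the logic of that expression. The `Read` class and `passes_filter` function are hypothetical stand-ins; Sambamba itself evaluates the expression on BAM records.

```python
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical stand-in for the alignment fields used by the filter."""
    mapping_quality: int
    unmapped: bool = False
    mate_is_unmapped: bool = False


def passes_filter(read, min_mapq=30):
    """Mirror 'mapping_quality >= 30 and not (unmapped or mate_is_unmapped)'."""
    return read.mapping_quality >= min_mapq and not (
        read.unmapped or read.mate_is_unmapped
    )


# Made-up example reads, for illustration only
reads = [
    Read(mapping_quality=42),
    Read(mapping_quality=12),
    Read(mapping_quality=60, mate_is_unmapped=True),
    Read(mapping_quality=30),
]
kept = [r for r in reads if passes_filter(r)]
print(f"kept {len(kept)} of {len(reads)} reads")
```

A read is kept only when it meets the quality threshold and both the read and its mate are mapped.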
Code
rule bowtie2_build:
    input:
        ref="<genome_sequence>",
    output:
        multiext(
            "<resources>/bowtie_index/genome",
            ".1.bt2",
            ".2.bt2",
            ".3.bt2",
            ".4.bt2",
            ".rev.1.bt2",
            ".rev.2.bt2",
        ),
    log:
        "<logs>/bowtie2_build/build.log",
    params:
        extra="",
    threads: 8
    wrapper:
        "v3.11.0/bio/bowtie2/build"

rule bowtie2_alignment:
    input:
        sample=["<reads_r1>", "<reads_r2>"],
        idx=multiext(
            "<resources>/bowtie_index/genome",
            ".1.bt2",
            ".2.bt2",
            ".3.bt2",
            ".4.bt2",
            ".rev.1.bt2",
            ".rev.2.bt2",
        ),
    output:
        temp("<results>/mapped/{sample}.bam"),
    log:
        "<logs>/bowtie2/{sample}.log",
    params:
        extra=(
            " --rg-id {sample} "
            "--rg 'SM:{sample} LB:FakeLib PU:FakePU.1.{sample} PL:ILLUMINA' "
        ),
    threads: 8
    wrapper:
        "v7.6.0/bio/bowtie2/align"

rule sambamba_sort:
    input:
        "<results>/mapped/{sample}.bam",
    output:
        temp("<results>/mapped/{sample}.sorted.bam"),
    params:
        "",
    log:
        "<logs>/sambamba-sort/{sample}.log",
    threads: 8
    wrapper:
        "v3.11.0/bio/sambamba/sort"

rule sambamba_view:
    input:
        "<results>/mapped/{sample}.sorted.bam",
    output:
        temp("<results>/mapped/{sample}.filtered.bam"),
    params:
        extra=(
            " --format 'bam' "
            "--filter 'mapping_quality >= 30 and not (unmapped or mate_is_unmapped)' "
        ),
    log:
        "<logs>/sambamba-view/{sample}.log",
    threads: 8
    wrapper:
        "v6.1.0/bio/sambamba/view"

rule sambamba_markdup:
    input:
        "<results>/mapped/{sample}.filtered.bam",
    output:
        "<results>/mapped/{sample}.rmdup.bam",
    params:
        extra=" --remove-duplicates ",  # optional parameters
    log:
        "<logs>/sambamba-markdup/{sample}.log",
    threads: 8
    wrapper:
        "v6.1.0/bio/sambamba/markdup"

rule sambamba_index:
    input:
        "<results>/mapped/{sample}.rmdup.bam",
    output:
        "<results>/mapped/{sample}.rmdup.bam.bai",
    params:
        extra="",
    log:
        "<logs>/sambamba-index/{sample}.log",
    threads: 8
    wrapper:
        "v6.1.0/bio/sambamba/index"