SAMTOOLS FASTQ SEPARATE
Convert a BAM file with paired-end reads back to unaligned reads in two separate FASTQ files with samtools. Reads that are not properly paired are discarded (READ_OTHER and singleton reads in the samtools fastq documentation), as are secondary (0x100) and supplementary (0x800) reads.
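The flag-based filtering described above can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic, not code from the wrapper: the function names `is_excluded` and `classify` are hypothetical, the flag values come from the SAM specification, and the READ1/READ2/READ_OTHER rule follows the samtools fastq documentation.

```python
# SAM flag bits (values from the SAM specification).
READ1 = 0x40           # first segment in the template
READ2 = 0x80           # last segment in the template
SECONDARY = 0x100      # secondary alignment
SUPPLEMENTARY = 0x800  # supplementary alignment

# The wrapper passes -F 0x900, i.e. both bits combined.
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY


def is_excluded(flag, mask=EXCLUDE_MASK):
    """Return True if any bit of `mask` is set in `flag` (the -F semantics)."""
    return bool(flag & mask)


def classify(flag):
    """Hypothetical helper: name the category samtools fastq assigns a read to."""
    first = bool(flag & READ1)
    second = bool(flag & READ2)
    if first and not second:
        return "READ1"
    if second and not first:
        return "READ2"
    # Both bits set or both unset: written to the -0 file (here /dev/null).
    return "READ_OTHER"


if __name__ == "__main__":
    for flag in (0x43, 0x83, 0x143, 0x0):
        print(hex(flag), is_excluded(flag), classify(flag))
```

Singleton handling (the `-s` file) depends on whether the mate is present in the collated stream, so it is not covered by this per-read sketch.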
Example
This wrapper can be used in the following way:
rule samtools_fastq_separate:
    input:
        "mapped/{sample}.bam",
    output:
        "reads/{sample}.1.fq",
        "reads/{sample}.2.fq",
    log:
        "{sample}.separate.log",
    params:
        collate="",
        fastq="-n",
    # Total number of threads for this rule. At least 2 threads have to be
    # requested on cluster submission; this value minus 2 is passed to
    # samtools collate as the number of additional threads (--threads).
    threads: 3
    wrapper:
        "v4.6.0-24-g250dd3e/bio/samtools/fastq/separate"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The collate and fastq params allow for additional program arguments to samtools collate and samtools fastq, respectively.
For more information, see http://www.htslib.org/doc/samtools-fasta.html
Software dependencies
samtools=1.14
snakemake-wrapper-utils=0.5.2
Code
__author__ = "David Laehnemann, Victoria Sack"
__copyright__ = "Copyright 2018, David Laehnemann, Victoria Sack"
__email__ = "david.laehnemann@hhu.de"
__license__ = "MIT"
import os
import tempfile
from pathlib import Path
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import get_mem
params_collate = snakemake.params.get("collate", "")
params_fastq = snakemake.params.get("fastq", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
# Samtools takes additional threads through its option --threads (-@).
# One thread is used by samtools collate.
# One thread is used by samtools fastq.
# So snakemake.threads has to take them into account
# before allowing additional threads through samtools collate --threads.
threads = 0 if snakemake.threads <= 2 else snakemake.threads - 2
shell(
    "(samtools collate -u -O"
    " --threads {threads}"
    " {params_collate}"
    " {snakemake.input[0]} | "
    "samtools fastq"
    " {params_fastq}"
    " -1 {snakemake.output[0]}"
    " -2 {snakemake.output[1]}"
    " -0 /dev/null"
    " -s /dev/null"
    " -F 0x900"
    " - "
    ") {log}"
)
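To make the thread arithmetic and the resulting pipeline easier to inspect outside of Snakemake, the same logic can be written as two plain functions. This is a minimal sketch under stated assumptions: `extra_threads` and `build_command` are hypothetical names that do not exist in the wrapper, and the log redirection that `snakemake.log_fmt_shell` would append is left out.

```python
def extra_threads(total_threads):
    """Threads left for samtools collate after reserving one each
    for the collate and fastq processes (same rule as the wrapper)."""
    return 0 if total_threads <= 2 else total_threads - 2


def build_command(input_bam, out1, out2, total_threads, collate="", fastq=""):
    """Hypothetical helper: assemble the command string the wrapper runs,
    without the log redirection."""
    return (
        "samtools collate -u -O"
        f" --threads {extra_threads(total_threads)}"
        f" {collate}"
        f" {input_bam} | "
        "samtools fastq"
        f" {fastq}"
        f" -1 {out1}"
        f" -2 {out2}"
        " -0 /dev/null"
        " -s /dev/null"
        " -F 0x900"
        " -"
    )


if __name__ == "__main__":
    # Mirrors the example rule: threads: 3 and fastq="-n".
    print(build_command(
        "mapped/a.bam", "reads/a.1.fq", "reads/a.2.fq", 3, fastq="-n"
    ))
```

With `threads: 3` as in the example rule, one additional thread is passed to samtools collate; with 1 or 2 threads, none are.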