SAMTOOLS FASTQ SEPARATE


Convert a BAM file with paired-end reads back to unaligned reads in two separate FASTQ files with samtools. Reads that are not properly paired are discarded (READ_OTHER and singleton reads in the samtools fastq documentation), as are secondary (0x100) and supplementary (0x800) alignments.
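The two excluded flag bits combine into the single mask 0x900 that the wrapper passes to samtools fastq via -F. The following is a minimal sketch of that bit arithmetic; the names EXCLUDE_MASK and is_excluded are hypothetical and only illustrate the filter, they are not part of the wrapper or of samtools.

```python
# Flag bits as defined in the SAM format specification.
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

# Combined mask; this is the value the wrapper passes as "-F 0x900".
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY


def is_excluded(flag: int) -> bool:
    """Return True if any bit of EXCLUDE_MASK is set in the read's FLAG.

    Hypothetical helper: mirrors the documented -F semantics, where a
    record is skipped if it has any of the given bits set.
    """
    return (flag & EXCLUDE_MASK) != 0


# FLAG 99 = paired, properly paired, mate reverse, first in pair: kept.
print(is_excluded(99))
# FLAG 2064 = supplementary (0x800) + reverse strand (0x10): dropped.
print(is_excluded(2064))
```

Reads that are not properly paired are handled separately: they are routed to /dev/null through the -0 and -s options rather than through this mask.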

Example

This wrapper can be used in the following way:

rule samtools_fastq_separate:
    input:
        "mapped/{sample}.bam",
    output:
        "reads/{sample}.1.fq",
        "reads/{sample}.2.fq",
    log:
        "{sample}.separate.log",
    params:
        collate="",
        fastq="-n",
    # At least 2 threads have to be requested on cluster submission (one each for samtools collate and samtools fastq). Any threads beyond these 2 are passed to samtools collate as additional threads via its --threads argument.
    threads: 3
    wrapper:
        "v3.12.1/bio/samtools/fastq/separate"

Note that input, output and log file paths can be chosen freely.
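To make the thread accounting in the rule concrete, here is a small sketch of how the rule's threads value maps to the number of additional threads handed to samtools collate. The function name extra_collate_threads is hypothetical; the arithmetic follows the wrapper code shown in the Code section.

```python
def extra_collate_threads(rule_threads: int) -> int:
    """Additional threads for samtools collate, given the rule's `threads`.

    Two threads are reserved (one for samtools collate, one for
    samtools fastq); only the remainder is passed on via --threads.
    Hypothetical helper name, same arithmetic as the wrapper.
    """
    return 0 if rule_threads <= 2 else rule_threads - 2


# With `threads: 3` as in the example rule, collate gets 1 extra thread.
for requested in (1, 2, 3, 8):
    print(requested, extra_collate_threads(requested))
```

Requesting 1 or 2 threads therefore both result in no additional threads for samtools collate.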

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

Software dependencies

  • samtools=1.14

  • snakemake-wrapper-utils=0.5.2

Authors

  • David Laehnemann

  • Victoria Sack

  • Filipe G. Vieira

Code

__author__ = "David Laehnemann, Victoria Sack"
__copyright__ = "Copyright 2018, David Laehnemann, Victoria Sack"
__email__ = "david.laehnemann@hhu.de"
__license__ = "MIT"


import os
import tempfile
from pathlib import Path
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import get_mem

params_collate = snakemake.params.get("collate", "")
params_fastq = snakemake.params.get("fastq", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Samtools takes additional threads through its option -@ (--threads)
# One thread is used by samtools collate
# One thread is used by samtools fastq
# So snakemake.threads has to take them into account
# before allowing additional threads through samtools collate --threads
threads = 0 if snakemake.threads <= 2 else snakemake.threads - 2

shell(
    "(samtools collate -u -O"
    " --threads {threads}"
    " {params_collate}"
    " {snakemake.input[0]} | "
    "samtools fastq"
    " {params_fastq}"
    " -1 {snakemake.output[0]}"
    " -2 {snakemake.output[1]}"
    " -0 /dev/null"
    " -s /dev/null"
    " -F 0x900"
    " - "
    ") {log}"
)
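As an illustration of what the shell call above expands to, the sketch below assembles the same pipeline as a plain string for the example rule with sample "A". It assumes a hypothetical helper build_command, uses ordinary Python string joining instead of Snakemake's shell() formatting, and leaves out the log redirection.

```python
def build_command(bam, fq1, fq2, extra_threads, collate_extra="", fastq_extra=""):
    """Assemble the collate | fastq pipeline as a single string.

    Hypothetical helper for illustration only; the wrapper itself
    formats the command through snakemake.shell.
    """
    parts = [
        "samtools collate -u -O",
        f"--threads {extra_threads}",
        collate_extra,  # value of params.collate, may be empty
        bam,
        "|",
        "samtools fastq",
        fastq_extra,  # value of params.fastq, may be empty
        f"-1 {fq1}",
        f"-2 {fq2}",
        "-0 /dev/null -s /dev/null -F 0x900 -",
    ]
    # Skip empty parameter strings so no double spaces appear.
    return " ".join(p for p in parts if p)


cmd = build_command(
    "mapped/A.bam", "reads/A.1.fq", "reads/A.2.fq",
    extra_threads=1, fastq_extra="-n",
)
print(cmd)
```

Collating first groups the reads of each pair next to each other, so samtools fastq can write the mates to the -1 and -2 files as it streams through the piped input.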