BWA-METH MEMX

https://img.shields.io/badge/wrapper_version-v9.5.0-10785b https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/bwameth/memx?label=version%20update%20pull%20requests&color=1cb481

Align BS-Seq reads on an indexed reference

URL: https://github.com/brentp/bwa-meth?tab=readme-ov-file#align

Example

This wrapper can be used in the following way:

rule test_bwameth_mem:
    input:
        ref="mem/genome.fasta",
        idx=multiext(
            "mem/genome.fasta.bwameth",
            ".c2t",
            ".c2t.amb",
            ".c2t.ann",
            ".c2t.bwt",
            ".c2t.pac",
            ".c2t.sa",
        ),
        fq=["A.fastq"],
    output:
        "A.mem.bam",
    log:
        "test_bwameth_mem.log",
    threads: 2
    params:
        extra="",  # Optional parameters for bwameth.py, besides IO and threading
        sort="none",  # Either `none` (default), `samtools`, or `picard`
        sort_order="coordinate",  # Either `coordinate` (default) or `queryname`
        sort_extra="",  # Optional parameters for sambamba/samtools/picard. Ignored if `sort` is equal to `none`.
    wrapper:
        "v9.5.0/bio/bwameth/memx"


rule test_bwameth_mem2:
    input:
        ref="mem2/genome.fasta",
        idx=multiext(
            "mem2/genome.fasta.bwameth",
            ".c2t",
            ".c2t.amb",
            ".c2t.ann",
            ".c2t.bwt.2bit.64",
            ".c2t.pac",
            ".c2t.0123",
        ),
        fq=["A.fastq"],
    output:
        "A.mem2.bam",
    log:
        "test_bwameth_mem2.log",
    threads: 2
    params:
        extra="",  # Optional parameters for bwameth.py, besides IO and threading
        sort="none",  # Either `none` (default), `samtools`, or `picard`
        sort_order="coordinate",  # Either `coordinate` (default) or `queryname`
        sort_extra="",  # Optional parameters for sambamba/samtools/picard. Ignored if `sort` is equal to `none`.
    wrapper:
        "v9.5.0/bio/bwameth/memx"


rule test_bwameth_mem_pe:
    input:
        ref="mem/genome.fasta",
        idx=multiext(
            "mem/genome.fasta.bwameth",
            ".c2t",
            ".c2t.amb",
            ".c2t.ann",
            ".c2t.bwt",
            ".c2t.pac",
            ".c2t.sa",
        ),
        fq1="A.fastq",
        fq2="B.fastq",
    output:
        "AB_pe.mem.bam",
    log:
        "test_bwameth_mem.log",
    threads: 2
    params:
        extra="",  # Optional parameters for bwameth.py, besides IO and threading
        sort="none",  # Either `none` (default), `samtools`, or `picard`
        sort_order="coordinate",  # Either `coordinate` (default) or `queryname`
        sort_extra="",  # Optional parameters for sambamba/samtools/picard. Ignored if `sort` is equal to `none`.
    wrapper:
        "v9.5.0/bio/bwameth/memx"


rule test_bwameth_mem_sort_picard:
    input:
        ref="mem/genome.fasta",
        idx=multiext(
            "mem/genome.fasta.bwameth",
            ".c2t",
            ".c2t.amb",
            ".c2t.ann",
            ".c2t.bwt",
            ".c2t.pac",
            ".c2t.sa",
        ),
        fq=["A.fastq"],
    output:
        "A.picard_sort.bam",
    log:
        "test_bwameth_mem_sort_picard.log",
    threads: 2
    params:
        extra="",  # Optional parameters for bwameth.py, besides IO and threading
        sort="picard",  # Either `none` (default), `samtools`, or `picard`
        sort_order="coordinate",  # Either `coordinate` (default) or `queryname`
        sort_extra="",  # Optional parameters for sambamba/samtools/picard. Ignored if `sort` is equal to `none`.
    wrapper:
        "v9.5.0/bio/bwameth/memx"


rule test_bwameth_mem_sort_samtools:
    input:
        ref="mem/genome.fasta",
        idx=multiext(
            "mem/genome.fasta.bwameth",
            ".c2t",
            ".c2t.amb",
            ".c2t.ann",
            ".c2t.bwt",
            ".c2t.pac",
            ".c2t.sa",
        ),
        fq=["A.fastq"],
    output:
        "A.samtools_sort.bam",
    log:
        "test_bwameth_mem_sort_samtools.log",
    threads: 2
    params:
        extra="",  # Optional parameters for bwameth.py, besides IO and threading
        sort="samtools",  # Either `none` (default), `samtools`, or `picard`
        sort_order="coordinate",  # Either `coordinate` (default) or `queryname`
        sort_extra="",  # Optional parameters for sambamba/samtools/picard. Ignored if `sort` is equal to `none`.
    wrapper:
        "v9.5.0/bio/bwameth/memx"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

Both SAM sorting and compression and handled by either samtools or picard. No duplication analysis is performed by this wrapper.

Software dependencies

  • bwameth=0.2.9

  • samtools=1.23.1

  • picard-slim=3.4.0

  • snakemake-wrapper-utils=0.8.0

Input/Output

Input:

  • idx: List of paths to indexed reference files

  • fq1: List of FASTQ files (multiple files will be merged before alignment).

  • fq2: List of paired-end FASTQ files (multiple files will be merged before alignment).

  • ref: Path to reference fasta file

Output:

  • Path to mapped reads (SAM/BAM/CRAM)

Params

  • extra: Optional parameters passed to bwameth.py, besides IO and threading.

  • sort: Either none (default), picard, or samtools.

  • sort_order: Either coordinate (default), or queryname.

  • sort_extra: Optional parameters provided to samtools/picard besides IO and threading.

Authors

  • Thibault Dayris

Code

# coding: utf-8

"""
Snakemake wrapper for bwameth

Proceeed according to other bwa wrappers: Align + optional sort
"""

__author__ = "Thibault Dayris"
__mail__ = "thibault.dayris@gustaveroussy.fr"
__copyright__ = "Copyright 2024, Thibault Dayris"
__license__ = "MIT"

import os.path
import time

from tempfile import TemporaryDirectory
from snakemake import shell
from snakemake_wrapper_utils.java import get_java_opts
from snakemake_wrapper_utils.samtools import get_samtools_opts

# Extract arguments
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

sort = snakemake.params.get("sort", "none")
sort_order = snakemake.params.get("sort_order", "coordinate")
sort_extra = snakemake.params.get("sort_extra", "")
samtools_opts = get_samtools_opts(snakemake, param_name="sort_extra")
java_opts = get_java_opts(snakemake)

bwa_threads = snakemake.threads
samtools_threads = snakemake.threads - 1

# Check arguments
if sort_order not in {"coordinate", "queryname"}:
    raise ValueError(f"Unexpected value for sort_order ({sort_order})")

# Determine which pipe command to use for converting to bam or sorting.
if sort == "none":
    # Correctly assign number of threads according to user request
    if samtools_threads >= 1:
        samtools_opts += f" --threads {samtools_threads} "

    if str(snakemake.output[0]).lower().endswith(("bam", "cram")):
        # Simply convert to bam using samtools view.
        pipe_cmd = f" | samtools view {samtools_opts} > {snakemake.output[0]}"
    else:
        # Do not perform any sort nor compression, output raw sam
        pipe_cmd = f" > {snakemake.output[0]} "


elif sort == "samtools":
    # Correctly assign number of threads according to user request
    if samtools_threads >= 1:
        samtools_opts += " --threads {samtools_threads} "

    # Add name flag if needed.
    if sort_order == "queryname":
        sort_extra += " -n"

    # Sort alignments using samtools sort.
    pipe_cmd = " | samtools sort {samtools_opts} {sort_extra} -T {tmpdir} > {snakemake.output[0]}"


elif sort == "picard":
    # Correctly assign number of threads according to user request
    bwa_threads -= 1
    if bwa_threads <= 0:
        raise ValueError(
            "Not enough threads requested. This wrapper requires exactly one more."
        )

    # Sort alignments using picard SortSam.
    pipe_cmd = (
        " | picard SortSam {java_opts} {sort_extra} "
        "--INPUT /dev/stdin "
        "--TMP_DIR {tmpdir} "
        "--SORT_ORDER {sort_order} "
        "--OUTPUT {snakemake.output[0]}"
    )


else:
    raise ValueError(f"Unexpected value for params.sort ({sort})")

# Gathering fastq files to align
fq = snakemake.input.get("fq")
fq1 = snakemake.input.get("fq1")
fq2 = snakemake.input.get("fq2")

# Single-ended case
if fq:
    if isinstance(fq, list):
        fastq_files = ",".join(fq)
    else:
        fastq_files = fq

# Pair-ended case
elif fq1 and fq2:
    if isinstance(fq1, list) and isinstance(fq2, list):
        if len(fq1) == len(fq2):
            fastq_files = f'{",".join(fq1)} {",".join(fq2)}'
        else:
            raise ValueError("Please provide as many R1 and R2 files")
    else:
        fastq_files = f"{fq1} {fq2}"

# Missing fastq case
else:
    raise ValueError("Either provide `input.fq` or both `input.fq1` and `input.fq2`")

with TemporaryDirectory() as tmpdir:
    pipe_cmd = pipe_cmd.format(**locals())
    shell(
        "(bwameth.py --threads {snakemake.threads} "
        " {extra} --reference {snakemake.input.ref} "
        " {fastq_files} {pipe_cmd} ) {log}"
    )