DEEPTOOLS BAMPEFRAGMENTSIZE

bamPEFragmentSize calculates the fragment sizes for read pairs in paired-end sequencing BAM files. It generates a histogram of fragment sizes and can output the raw fragment length data. For usage information about deepTools bamPEFragmentSize, please see the [documentation](https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html). For more information about deepTools, also see the [source code](https://github.com/deeptools/deepTools).

URL: https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html

Example

This wrapper can be used in the following way:

rule deeptools_bampe_fragmentsize:
    input:
        # Input BAM file(s)
        bams=["a.bam", "b.bam"],
        # Optional blacklist file in BED format to exclude specific regions from analysis
        # blacklist="",
    output:
        # The -o/-hist/--histogram and --outRawFragmentLengths options are set exclusively via output files.
        # Usable output variables, their extensions and the option each one implicitly calls are listed here:
        # https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/deeptools/bamPEFragmentSize.html.
        # Required
        hist="results/histogram.png",
        # Optional output files
        raw="results/raw.tab",
    log:
        "logs/deeptools/bampe_fragmentsize.log",
    threads: 4
    params:
        # Labels can be changed to anything
        # If left empty, the sample name will be used
        # (without path and .bam extension)
        # Format: list matching the number of input BAMs
        # Example: ["sample1", "sample2"] or "" for automatic labels
        labels="",
        # Additional parameters for deeptools bamPEFragmentSize
        # Example: --maxFragmentLength 1000 --binSize 10
        extra="--logScale",
    wrapper:
        "v6.1.0/bio/deeptools/bampefragmentsize"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • deeptools=3.5.6

Input/Output

Input:

  • bams: List of BAM files (.bam)

  • blacklist: Optional BED file with regions to skip (.bed)

Output:

  • hist: Fragment size histogram (.png, .pdf, .svg, .eps or .plotly)

  • raw: Raw fragment lengths (.tab) (optional)
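
The wrapper code rejects histogram paths whose extension is not a supported plot format. A minimal sketch of that check is shown here; the function name check_hist_format is hypothetical and only illustrates the logic, it is not part of the wrapper.

```python
from pathlib import Path

# Extensions accepted by the wrapper's output check
VALID_FORMATS = {".png", ".pdf", ".svg", ".eps", ".plotly"}


def check_hist_format(hist_path):
    """Return the extension of hist_path, or raise if it is not a supported format.

    Hypothetical helper for illustration; the wrapper performs this check inline.
    """
    suffix = Path(hist_path).suffix
    if suffix not in VALID_FORMATS:
        raise ValueError(
            f"Invalid output format '{suffix}'. "
            f"Must be one of: {', '.join(sorted(VALID_FORMATS))}"
        )
    return suffix


print(check_hist_format("results/histogram.png"))
```

A path such as results/histogram.txt would raise a ValueError before bamPEFragmentSize is ever started.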

Params

  • labels: Labels for plotting (list of strings, or "" for automatic labelling)

  • extra: Optional parameters given to bamPEFragmentSize
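
To illustrate the automatic labelling, the sketch below shows one way labels could be resolved: user-supplied labels are kept, otherwise each BAM file name is used without its directory and extension. The function name default_labels is hypothetical, and the sketch assumes that labels, when given, is a list with one entry per BAM file.

```python
from pathlib import Path


def default_labels(bam_files, labels=""):
    """Return the given labels, or derive them from the BAM file names.

    Hypothetical helper for illustration; assumes labels is a list when provided.
    """
    if labels:
        resolved = list(labels)
    else:
        # "mapped/a.bam" -> "a": drop the directory and the extension
        resolved = [Path(bam).stem for bam in bam_files]
    if len(resolved) != len(bam_files):
        raise ValueError("Number of labels must be equal to the number of bam files")
    return resolved


print(default_labels(["mapped/a.bam", "mapped/b.bam"]))
print(default_labels(["mapped/a.bam"], labels=["treated"]))
```

Passing a label list whose length differs from the number of BAM files raises an error, which matches the length check in the wrapper code.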

Authors

  • Niek Wit

Code

__author__ = "Niek Wit"
__copyright__ = "Copyright 2025, Niek Wit"
__email__ = "niekwit@gmail.com"
__license__ = "MIT"

from pathlib import Path
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Get input files
# Let Snakemake handle the error if bams are missing
bam_files = snakemake.input.bams
blacklist = snakemake.input.get("blacklist", "")
if blacklist:
    blacklist = f"--blackListFileName {blacklist}"

# Get/create sample labels (remove .bam extension and dir)
# If no labels are provided, use the basename of the bam file
sample_label = snakemake.params.get("labels", "")
if not sample_label:
    sample_label = [Path(bam).stem for bam in bam_files if Path(bam).suffix == ".bam"]

# Check if the number of labels is equal to the number of bam files
assert len(sample_label) == len(
    bam_files
), "Number of labels must be equal to the number of bam files"

# Check output format
out_format = Path(snakemake.output.hist).suffix
VALID_FORMATS = {".png", ".pdf", ".svg", ".eps", ".plotly"}
if out_format not in VALID_FORMATS:
    raise ValueError(
        f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
    )

# Optional output
out_raw = snakemake.output.get("raw", "")
if out_raw:
    out_raw = f"--outRawFragmentLengths {out_raw}"

# Parameters
extra = snakemake.params.get("extra", "")

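# Pass the validated labels to the tool; without --samplesLabel they would
# be computed above but never used
shell(
    "bamPEFragmentSize "
    "--numberOfProcessors {snakemake.threads} "
    "-b {bam_files} "
    "--samplesLabel {sample_label} "
    "-o {snakemake.output.hist} "
    "{blacklist} {out_raw} {extra} {log}"
)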
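
The way optional inputs and outputs become command-line flags can be sketched outside Snakemake as follows. This is a simplified illustration, not the wrapper itself: the function build_command is hypothetical, it only assembles the command string without running it, and it leaves out labels and log redirection for brevity.

```python
def build_command(bams, hist, threads=1, blacklist="", raw="", extra=""):
    """Assemble a bamPEFragmentSize command string from rule-like arguments.

    Hypothetical helper for illustration; optional values add a flag only when set.
    """
    parts = [
        "bamPEFragmentSize",
        f"--numberOfProcessors {threads}",
        "-b " + " ".join(bams),
        f"-o {hist}",
    ]
    if blacklist:
        parts.append(f"--blackListFileName {blacklist}")
    if raw:
        parts.append(f"--outRawFragmentLengths {raw}")
    if extra:
        parts.append(extra)
    return " ".join(parts)


print(
    build_command(
        ["a.bam", "b.bam"],
        "results/histogram.png",
        threads=4,
        raw="results/raw.tab",
        extra="--logScale",
    )
)
```

With the values from the example rule, the blacklist flag is omitted because no blacklist is given, while the raw output and the extra parameters are appended at the end of the command.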