DEEPTOOLS BAMPEFRAGMENTSIZE
bamPEFragmentSize calculates the fragment sizes for read pairs in paired-end sequencing BAM files. It generates a histogram of fragment sizes and can output the raw fragment length data. For usage information about deepTools bamPEFragmentSize, please see the [documentation](https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html). For more information about deepTools, also see the [source code](https://github.com/deeptools/deepTools).
URL: https://deeptools.readthedocs.io/en/latest/content/tools/bamPEFragmentSize.html
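To make the idea of a fragment-size histogram concrete, here is a minimal Python sketch that bins a list of fragment lengths. It is only an illustration: the function name `bin_fragment_lengths`, the bin size, and the example lengths are hypothetical, and the sketch does not reproduce how deepTools reads BAM files or computes its statistics.

```python
from collections import Counter


def bin_fragment_lengths(lengths, bin_size=10):
    """Count fragment lengths per bin; each key is the lower edge of a bin."""
    counts = Counter((length // bin_size) * bin_size for length in lengths)
    return dict(sorted(counts.items()))


# Hypothetical fragment lengths (in bp) for a handful of read pairs
lengths = [148, 152, 155, 161, 203, 209, 310]
bin_size = 10
histogram = bin_fragment_lengths(lengths, bin_size=bin_size)

for lower_edge, count in histogram.items():
    print(f"{lower_edge}-{lower_edge + bin_size - 1} bp: {count}")
```

In practice the binning and plotting are done by bamPEFragmentSize itself; the sketch only shows what the histogram summarizes.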
Example
This wrapper can be used in the following way:
rule deeptools_bampe_fragmentsize:
    input:
        # Input BAM file(s)
        bams=["a.bam", "b.bam"],
        # Optional blacklist file in BED format to exclude specific regions from analysis
        # blacklist="",
    output:
        # Please note that -o/hist/--histogram and --outRawFragmentLengths are exclusively defined via output files.
        # Usable output variables, their extensions and which option they implicitly call are listed here:
        # https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/deeptools/bamPEFragmentSize.html.
        # Required
        hist="results/histogram.png",
        # Optional output files
        raw="results/raw.tab",
    log:
        "logs/deeptools/bampe_fragmentsize.log",
    threads: 4
    params:
        # Labels can be changed to anything
        # If left empty, the sample name will be used
        # (without path and .bam extension)
        # Format: list matching the number of input BAMs
        # Example: ["sample1", "sample2"] or "" for automatic labels
        labels="",
        # Additional parameters for deeptools bamPEFragmentSize
        # Example: --maxFragmentLength 1000 --binSize 10
        extra="--logScale",
    wrapper:
        "v7.0.0/bio/deeptools/bampefragmentsize"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
deeptools=3.5.6
Input/Output
Input:
- bams: List of BAM files (.bam)
- blacklist: Optional BED file with regions to skip (.bed)
Output:
- hist: Fragment size histogram (.png)
- raw: Raw fragment lengths (.tab) (optional)
Params
- labels: Labels for plotting (list of strings, or "" for automatic labelling)
- extra: Optional parameters given to bamPEFragmentSize
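The automatic labelling rule can be sketched in a few lines of Python. This is a simplified illustration, not the wrapper itself: `resolve_labels` is a hypothetical helper, and it assumes that every input path ends in `.bam`.

```python
from pathlib import Path


def resolve_labels(bam_files, labels=""):
    """Return one label per BAM file, falling back to the file stem."""
    if not labels:
        # "mapped/a.bam" -> "a": drop the directory and the .bam extension
        labels = [Path(bam).stem for bam in bam_files]
    if len(labels) != len(bam_files):
        raise ValueError("Number of labels must be equal to the number of bam files")
    return list(labels)


# Automatic labels derived from the file names
print(resolve_labels(["mapped/a.bam", "mapped/b.bam"]))
# Explicit labels, one per BAM file
print(resolve_labels(["mapped/a.bam", "mapped/b.bam"], ["ctrl", "treated"]))
```

Passing a label list whose length differs from the number of BAM files raises an error in this sketch, mirroring the length check described above.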
Code
__author__ = "Niek Wit"
__copyright__ = "Copyright 2025, Niek Wit"
__email__ = "niekwit@gmail.com"
__license__ = "MIT"

from pathlib import Path
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Get input files
# Let Snakemake handle the error if bams are missing
bam_files = snakemake.input.bams
blacklist = snakemake.input.get("blacklist", "")
if blacklist:
    blacklist = f"--blackListFileName {blacklist}"

# Get/create sample labels (remove .bam extension and dir)
# If no labels are provided, use the basename of the bam file
sample_label = snakemake.params.get("labels", "")
if not sample_label:
    sample_label = [Path(bam).stem for bam in bam_files if Path(bam).suffix == ".bam"]

# Check if the number of labels is equal to the number of bam files
assert len(sample_label) == len(
    bam_files
), "Number of labels must be equal to the number of bam files"

# Check output format
out_format = Path(snakemake.output.hist).suffix
VALID_FORMATS = {".png", ".pdf", ".svg", ".eps", ".plotly"}
if out_format not in VALID_FORMATS:
    raise ValueError(
        f"Invalid output format '{out_format}'. Must be one of: {', '.join(sorted(VALID_FORMATS))}"
    )

# Optional output
out_raw = snakemake.output.get("raw", "")
if out_raw:
    out_raw = f"--outRawFragmentLengths {out_raw}"

# Parameters
extra = snakemake.params.get("extra", "")

shell(
    "bamPEFragmentSize "
    "--numberOfProcessors {snakemake.threads} "
    "-b {bam_files} "
    "-o {snakemake.output.hist} "
    "{blacklist} {out_raw} {extra} {log}"
)
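To see how the optional pieces end up in the final command line, the standalone sketch below assembles a comparable command outside of Snakemake. `build_command` is a hypothetical helper written for illustration only. It uses just the flags that appear in the wrapper code above and leaves out log redirection.

```python
import shlex


def build_command(bams, hist, threads=1, blacklist="", raw="", extra=""):
    """Assemble a bamPEFragmentSize command; empty optional values add no flags."""
    parts = [
        "bamPEFragmentSize",
        "--numberOfProcessors", str(threads),
        "-b", *bams,
        "-o", hist,
    ]
    if blacklist:
        parts += ["--blackListFileName", blacklist]
    if raw:
        parts += ["--outRawFragmentLengths", raw]
    # Extra options are given as one string, as in the rule's params
    parts += shlex.split(extra)
    return shlex.join(parts)


command = build_command(
    ["a.bam", "b.bam"],
    "results/histogram.png",
    threads=4,
    raw="results/raw.tab",
    extra="--logScale",
)
print(command)
```

Building the command from a list and joining it with `shlex.join` keeps paths with spaces quoted correctly; the wrapper itself relies on Snakemake's `shell` formatting instead.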