DEEPTOOLS BAMCOVERAGE

https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/deeptools/bamcoverage?label=version%20update%20pull%20requests

deepTools bamcoverage takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. For more information about deepTools, also see the source code.

URL: https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html?highlight=bamcoverage

Example

This wrapper can be used in the following way:

rule test_deeptools_bamcoverage:
    input:
        bam="a.sorted.bam",
        bai="a.sorted.bam.bai",
        # Optional path to a blacklist bed file
        # blacklist="",
    output:
        "a.coverage.bw",
    params:
        effective_genome_size=1000,
        extra="",
    log:
        "logs/coverage.log",
    wrapper:
        "v5.0.0/bio/deeptools/bamcoverage"


rule test_deeptools_bamcoverage_default_eff_len:
    input:
        bam="a.sorted.bam",
        bai="a.sorted.bam.bai",
    output:
        "a.coverage_code.bw",
    params:
        genome="GRCm38",
        read_length=200,
    log:
        "logs/coverage_efault_eff_len.log",
    wrapper:
        "v5.0.0/bio/deeptools/bamcoverage"


rule test_deeptools_bamcoverage_no_params:
    input:
        bam="a.sorted.bam",
        bai="a.sorted.bam.bai",
    output:
        "a.coverage_no_params.bw",
    log:
        "logs/coverage.log",
    wrapper:
        "v5.0.0/bio/deeptools/bamcoverage"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • deeptools=3.5.5

Input/Output

Input:

  • bam: Path to alignment (BAM) file

  • blacklist: Path to optional blacklist region file (BED)

Output:

  • Path to coverage file

Params

  • effective_genome_size: Optional effective genome size value

  • genome: Optional parameter used to fill effective genome size with pre-computed parameters. Can only be one of GRCm37, GRCm38, GRCm39, GRCh37, GRCh38, dm3, dm6, WBcel235, or GRCz10.

  • read_length: Optional parameter used to fill effective genome size with pre-computed parameters. Can only be one of 50, 75, 100, 150, 200, or 250.

  • extra: Optional parameters to be given to deepTools bamcoverage

Authors

  • Thibault Dayris

Code

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2022, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

# See: https://deeptools.readthedocs.io/en/latest/content/feature/effectiveGenomeSize.html
default_effective_genome_size = {
    "GRCh37": {
        "50": 2685511454,
        "75": 2736124898,
        "100": 2776919708,
        "150": 2827436883,
        "200": 2855463800,
        "250": 2855044784,
    },
    "GRCh38": {
        "50": 2701495711,
        "75": 2747877702,
        "100": 2805636231,
        "150": 2862010428,
        "200": 2887553103,
        "250": 2898802627,
    },
    "GRCm37": {
        "50": 2304947876,
        "75": 2404646149,
        "100": 2462480910,
        "150": 2489384085,
        "200": 2513019076,
        "250": 2528988583,
    },
    "GRCm38": {
        "50": 2308125299,
        "75": 2407883243,
        "100": 2467481008,
        "150": 2494787038,
        "200": 2520868989,
        "250": 2538590322,
    },
    "GRCm39": {
        "50": 2309746861,
        "75": 2410055689,
        "100": 2468088461,
        "150": 2495461690,
        "200": 2521902382,
        "250": 2538633971,
    },
    "GRCz10": {
        "50": 1195445541,
        "75": 1251132611,
        "100": 1280188944,
        "150": 1312207019,
        "200": 1321355041,
        "250": 1339205109,
    },
    "GRCz11": {
        "50": 1197575653,
        "75": 1250812288,
        "100": 1280354977,
        "150": 1311832909,
        "200": 1322366338,
        "250": 1342093482,
    },
    "T2T/CHM13CAT_v2": {
        "50": 2725240337,
        "75": 2786136059,
        "100": 2814334875,
        "150": 2931551487,
        "200": 2936403235,
        "250": 2960856300,
    },
    "TAIR10": {
        "50": 114339094,
        "75": 115317469,
        "100": 118459858,
        "150": 118504138,
        "200": 117723393,
        "250": 119585546,
    },
    "WBcel235": {
        "50": 95159402,
        "75": 96945370,
        "100": 98259898,
        "150": 98721103,
        "200": 98672558,
        "250": 101271756,
    },
    "dm3": {
        "50": 130428510,
        "75": 135004387,
        "100": 139647132,
        "150": 144307658,
        "200": 148523810,
        "250": 151901455,
    },
    "dm6": {
        "50": 125464678,
        "75": 127324557,
        "100": 129789773,
        "150": 129940985,
        "200": 132508963,
        "250": 132900923,
    },
}

effective_genome_size = snakemake.params.get("effective_genome_size", "")
if not effective_genome_size:
    genome = snakemake.params.get("genome")
    read_length = snakemake.params.get("read_length")
    if genome and read_length:
        try:
            effective_genome_size = f"--effectiveGenomeSize {default_effective_genome_size[genome][str(read_length)]}"
        except KeyError:
            raise KeyError(
                (
                    "A valid genome and read_length need to be provided. See "
                    "https://deeptools.readthedocs.io/en/latest/content/feature/effectiveGenomeSize.html"
                )
            )
else:
    effective_genome_size = f"--effectiveGenomeSize {effective_genome_size}"


output_format = ""
bigwig_format = ["bw", "bigwig"]
bedgraph_format = ["bg", "bedgraph"]
output_ext = str(snakemake.output[0]).split(".")[-1].lower()
if output_ext in bigwig_format:
    output_format = "bigwig"
elif output_ext in bedgraph_format:
    output_format = "bedgraph"
else:
    raise ValueError("Output file should be either a bigwig or a bedgraph file")


blacklist = snakemake.input.get("blacklist", "")
if blacklist:
    blacklist = "--blackListFileName " + blacklist

shell(
    "bamCoverage "
    "{blacklist} {extra} "
    "--numberOfProcessors {snakemake.threads} "
    "{effective_genome_size} "
    "--bam {snakemake.input.bam} "
    "--outFileName {snakemake.output} "
    "--outFileFormat {output_format} "
    "{log} "
)