DEEPTOOLS PLOTFINGERPRINT

deepTools plotFingerprint plots a profile of cumulative read coverage for a list of indexed BAM files. For usage information, see the deepTools plotFingerprint documentation. For more information about deepTools, see its source code.

In addition to the required plot, an optional output file of read counts can be generated by setting the output variable “counts” (see the example Snakemake rule below). An optional output file of quality control metrics can likewise be generated by setting the output variable “qc_metrics”. If jsd_sample is specified in the input, the results of the Jensen-Shannon distance calculation are also written to the “qc_metrics” file.

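+------------------------+----------------------------------+---------------------+--------------------------+
| plotFingerprint option | Output                           | Name of output      | Recommended extension(s) |
|                        |                                  | variable to be used |                          |
+========================+==================================+=====================+==========================+
| --plotFile, -plot, -o  | coverage plot                    | fingerprint         | “.png” or “.eps” or      |
|                        |                                  | (required)          | “.pdf” or “.svg”         |
+------------------------+----------------------------------+---------------------+--------------------------+
| --outRawCounts         | tab-separated table of read      | counts              | “.tab”                   |
|                        | counts per bin (optional)        |                     |                          |
+------------------------+----------------------------------+---------------------+--------------------------+
| --outQualityMetrics    | tab-separated table of metrics   | qc_metrics          | “.txt”                   |
|                        | for quality control and for      |                     |                          |
|                        | results of Jensen-Shannon        |                     |                          |
|                        | distance calculation (optional)  |                     |                          |
+------------------------+----------------------------------+---------------------+--------------------------+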
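
The recommended extensions can be checked before a rule is written. The following is a minimal sketch, not part of the wrapper: the dictionary RECOMMENDED_EXTENSIONS and the helper unexpected_extensions are hypothetical names, and the extension sets are simply copied from the table above.

```python
from pathlib import Path

# Recommended extensions per output variable, copied from the table above.
# The dictionary name is hypothetical and not part of the wrapper.
RECOMMENDED_EXTENSIONS = {
    "fingerprint": {".png", ".eps", ".pdf", ".svg"},
    "counts": {".tab"},
    "qc_metrics": {".txt"},
}


def unexpected_extensions(outputs):
    """Return {variable: path} for outputs whose extension is not recommended."""
    return {
        name: path
        for name, path in outputs.items()
        if name in RECOMMENDED_EXTENSIONS
        and Path(path).suffix.lower() not in RECOMMENDED_EXTENSIONS[name]
    }


outputs = {
    "fingerprint": "plot_fingerprint/plot_fingerprint.png",
    "counts": "plot_fingerprint/raw_counts.tab",
    "qc_metrics": "plot_fingerprint/qc_metrics.csv",  # deliberately not ".txt"
}
flagged = unexpected_extensions(outputs)
print(flagged)
```

Here only the qc_metrics path is flagged, because “.csv” is not among the extensions recommended for that variable.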

Example

This wrapper can be used in the following way:

rule plot_fingerprint:
    input:
        bam_files=expand("samples/{sample}.bam", sample=["a", "b"]),
        bam_idx=expand("samples/{sample}.bam.bai", sample=["a", "b"]),
        jsd_sample="samples/b.bam" # optional, requires qc_metrics output
    output:
        # Please note that --plotFile, --outRawCounts and --outQualityMetrics are exclusively defined via output files.
        # Usable output variables, their extensions and which option they implicitly call are listed here:
        #         https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/deeptools/plotfingerprint.html
        fingerprint="plot_fingerprint/plot_fingerprint.png",  # required
        # optional output
        counts="plot_fingerprint/raw_counts.tab",
        qc_metrics="plot_fingerprint/qc_metrics.txt"
    log:
        "logs/deeptools/plot_fingerprint.log"
    params:
        # optional parameters
        "--numberOfSamples 200 "
    threads:
        8
    wrapper:
        "0.72.0/bio/deeptools/plotfingerprint"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • deeptools==3.4.3

Input/Output

Input:

  • list of BAM files (.bam) AND
  • list of their index files (.bam.bai)

Output:

  • plot file in image format (.png, .eps, .pdf or .svg)
  • tab-separated table of read counts per bin (.tab) (optional)
  • tab-separated table of metrics and JSD calculation (.txt) (optional)
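
Because every BAM file needs a matching index, the two input lists can be cross-checked before the rule runs. This is a minimal sketch under the assumption that each index is named by appending “.bai” to the BAM path, as in the example rule; the helper missing_indexes is a hypothetical name and not part of the wrapper.

```python
def missing_indexes(bam_files, index_files):
    """Return the expected .bam.bai paths that are absent from index_files."""
    provided = set(index_files)
    # Assumption: the index path is the BAM path with ".bai" appended.
    return [f"{bam}.bai" for bam in bam_files if f"{bam}.bai" not in provided]


bam_files = ["samples/a.bam", "samples/b.bam"]
index_files = ["samples/a.bam.bai"]  # index for b.bam is missing on purpose
missing = missing_indexes(bam_files, index_files)
print(missing)
```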

Authors

  • Antonie Vietor

Code

__author__ = "Antonie Vietor"
__copyright__ = "Copyright 2020, Antonie Vietor"
__email__ = "antonie.v@gmx.de"
__license__ = "MIT"

from snakemake.shell import shell
import re

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

jsd_sample = snakemake.input.get("jsd_sample")
out_counts = snakemake.output.get("counts")
out_metrics = snakemake.output.get("qc_metrics")
optional_output = ""
jsd = ""

if jsd_sample:
    jsd += " --JSDsample {jsd} ".format(jsd=jsd_sample)

if out_counts:
    optional_output += " --outRawCounts {out_counts} ".format(out_counts=out_counts)

if out_metrics:
    optional_output += " --outQualityMetrics {metrics} ".format(metrics=out_metrics)

shell(
    "(plotFingerprint "
    "-b {snakemake.input.bam_files} "
    "-o {snakemake.output.fingerprint} "
    "{optional_output} "
    "--numberOfProcessors {snakemake.threads} "
    "{jsd} "
    "{snakemake.params}) {log}"
)
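# ToDo: remove the 'NA' string replacement when fixed in deepTools, see:
# https://github.com/deeptools/deepTools/pull/999
# Two passes are needed: adjacent 'NA' fields share a tab, and regex matches cannot overlap.
regex_passes = 2

# qc_metrics is an optional output, so only post-process the file if it was requested.
if out_metrics:
    with open(out_metrics, "rt") as f:
        metrics = f.read()

    for i in range(regex_passes):
        metrics = re.sub("\tNA(\t|\n)", "\tnan\\1", metrics)

    with open(out_metrics, "wt") as f:
        f.write(metrics)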
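
The reason for running the substitution twice can be seen on a single row with consecutive 'NA' fields. The sketch below reuses the wrapper's pattern; the sample row and the helper replace_na are made up for illustration. The lookaround variant at the end is an alternative offered here as an assumption, not what the wrapper does.

```python
import re


def replace_na(text, passes):
    """Apply the wrapper's 'NA' -> 'nan' substitution the given number of times."""
    for _ in range(passes):
        text = re.sub("\tNA(\t|\n)", "\tnan\\1", text)
    return text


# Hypothetical metrics row with three adjacent 'NA' fields.
row = "sample\tNA\tNA\tNA\n"

# One pass: each match consumes the tab that the next 'NA' would need,
# so the middle field is skipped.
one_pass = replace_na(row, 1)

# Two passes: the skipped field is now surrounded by untouched tabs and matches.
two_pass = replace_na(row, 2)

# Alternative (not used by the wrapper): lookarounds do not consume the
# delimiters, so a single pass handles adjacent fields.
lookaround = re.sub(r"(?<=\t)NA(?=[\t\n])", "nan", row)

print(repr(one_pass))
print(repr(two_pass))
print(repr(lookaround))
```

After the first pass, every skipped 'NA' sits between fields that have already been rewritten, which is why two passes are enough for any run of adjacent 'NA' values.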