DEEPTOOLS PLOTFINGERPRINT
deepTools plotFingerprint plots a profile of cumulative read coverages from a list of indexed BAM files. For usage information about deepTools plotFingerprint, please see the documentation. For more information about deepTools, also see the source code.
In addition to the required output, an optional output file of read counts can be generated by setting the output variable “counts” (see the example Snakemake rule below). An optional output file of quality control metrics can likewise be generated by setting the output variable “qc_metrics”. If jsd_sample is specified in the input, the results of the Jensen-Shannon distance calculation are also written to the quality control metrics file.
| plotFingerprint option | Output | Name of output variable to be used | Recommended extension(s) |
|---|---|---|---|
| --plotFile, -plot, -o | coverage plot | fingerprint (required) | “.png” or “.eps” or “.pdf” or “.svg” |
| --outRawCounts | tab-separated table of read counts per bin | counts | “.tab” |
| --outQualityMetrics | tab-separated table of metrics for quality control and for results of Jensen-Shannon distance calculation (optional) | qc_metrics | “.txt” |
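The mapping from output variable names to plotFingerprint options can be sketched in plain Python. This is only an illustration under stated assumptions, not the wrapper's code: the helper name `fingerprint_args` and the dictionary `OPTIONAL_FLAGS` are hypothetical, while the option names and output variable names are the ones used by this wrapper.

```python
# Hypothetical lookup: optional output variable -> plotFingerprint option.
OPTIONAL_FLAGS = {
    "counts": "--outRawCounts",
    "qc_metrics": "--outQualityMetrics",
}


def fingerprint_args(outputs):
    """Return the output-related arguments for a dict of named outputs."""
    if "fingerprint" not in outputs:
        raise ValueError("the 'fingerprint' output is required")
    # The plot file is always passed; optional outputs only when defined.
    args = ["--plotFile", outputs["fingerprint"]]
    for name, flag in OPTIONAL_FLAGS.items():
        path = outputs.get(name)
        if path:
            args += [flag, path]
    return args


example_args = fingerprint_args(
    {
        "fingerprint": "plot_fingerprint/plot_fingerprint.png",
        "counts": "plot_fingerprint/raw_counts.tab",
    }
)
print(example_args)
```

Because “qc_metrics” is not set in this example, no --outQualityMetrics argument is produced.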
Example
This wrapper can be used in the following way:
rule plot_fingerprint:
    input:
        bam_files=expand("samples/{sample}.bam", sample=["a", "b"]),
        bam_idx=expand("samples/{sample}.bam.bai", sample=["a", "b"]),
        jsd_sample="samples/b.bam"  # optional, requires qc_metrics output
    output:
        # Please note that --plotFile and --outRawCounts are exclusively defined via output files.
        # Usable output variables, their extensions and which option they implicitly call are listed here:
        # https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/deeptools/plotfingerprint.html.
        fingerprint="plot_fingerprint/plot_fingerprint.png",  # required
        # optional output
        counts="plot_fingerprint/raw_counts.tab",
        qc_metrics="plot_fingerprint/qc_metrics.txt"
    log:
        "logs/deeptools/plot_fingerprint.log"
    params:
        # optional parameters
        "--numberOfSamples 200 "
    threads:
        8
    wrapper:
        "v2.6.0/bio/deeptools/plotfingerprint"
Note that input, output and log file paths can be chosen freely.
When running with snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
deeptools=3.5.2
Input/Output
Input:
- list of BAM files (.bam) AND
- list of their index files (.bam.bai)
Output:
- plot file in image format (.png, .eps, .pdf or .svg)
- tab-separated table of read counts per bin (.tab) (optional)
- tab-separated table of metrics and JSD calculation (.txt) (optional)
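The input requirements listed above can be checked before a rule is run. The following is a minimal sketch, not part of the wrapper: the function name `check_inputs` is hypothetical, and it assumes that each index file is named after its BAM file with “.bai” appended, as in the example rule. It also flags a jsd_sample given without a qc_metrics output, following the comment in the example rule.

```python
def check_inputs(bam_files, bam_idx, jsd_sample=None, qc_metrics=None):
    """Return a list of problems found in a set of inputs and outputs."""
    problems = []
    # Assumption: the index of "x.bam" is "x.bam.bai", as in the example rule.
    expected = {bam + ".bai" for bam in bam_files}
    for idx in sorted(expected - set(bam_idx)):
        problems.append("missing index: " + idx)
    # The example rule notes that jsd_sample requires the qc_metrics output.
    if jsd_sample and not qc_metrics:
        problems.append("jsd_sample is set but qc_metrics output is not")
    return problems


problems = check_inputs(
    bam_files=["samples/a.bam", "samples/b.bam"],
    bam_idx=["samples/a.bam.bai"],
    jsd_sample="samples/b.bam",
)
for problem in problems:
    print(problem)
```

An empty list means that every BAM file has a matching index and that the optional inputs and outputs are consistent with each other.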
Authors
- Antonie Vietor
Code
__author__ = "Antonie Vietor"
__copyright__ = "Copyright 2020, Antonie Vietor"
__email__ = "antonie.v@gmx.de"
__license__ = "MIT"

from snakemake.shell import shell
import re

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Optional inputs and outputs; .get() returns None when they are not defined.
jsd_sample = snakemake.input.get("jsd_sample")
out_counts = snakemake.output.get("counts")
out_metrics = snakemake.output.get("qc_metrics")

optional_output = ""
jsd = ""

if jsd_sample:
    jsd += " --JSDsample {jsd} ".format(jsd=jsd_sample)

if out_counts:
    optional_output += " --outRawCounts {out_counts} ".format(out_counts=out_counts)

if out_metrics:
    optional_output += " --outQualityMetrics {metrics} ".format(metrics=out_metrics)

shell(
    "(plotFingerprint "
    "-b {snakemake.input.bam_files} "
    "-o {snakemake.output.fingerprint} "
    "{optional_output} "
    "--numberOfProcessors {snakemake.threads} "
    "{jsd} "
    "{snakemake.params}) {log}"
)

# ToDo: remove the 'NA' string replacement when fixed in deepTools, see:
# https://github.com/deeptools/deepTools/pull/999
# Two passes are needed because re.sub replaces only non-overlapping matches:
# in "\tNA\tNA\t" the first match consumes the tab that precedes the second NA.
regex_passes = 2

# qc_metrics is optional, so the file is only post-processed when it was requested.
if out_metrics:
    with open(out_metrics, "rt") as f:
        metrics = f.read()
    for _ in range(regex_passes):
        metrics = re.sub("\tNA(\t|\n)", "\tnan\\1", metrics)

    with open(out_metrics, "wt") as f:
        f.write(metrics)
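
The reason for running the 'NA' substitution twice can be shown in isolation. The sketch below applies the same regular expression as the wrapper code to a made-up metrics row (the row content is an assumption for illustration only). The helper names `replace_na` and `replace_na_lookahead` are hypothetical. The lookahead variant is an alternative shown for comparison and is not what the wrapper does.

```python
import re


def replace_na(text, passes):
    """Apply the wrapper's NA substitution the given number of times."""
    for _ in range(passes):
        text = re.sub("\tNA(\t|\n)", "\tnan\\1", text)
    return text


def replace_na_lookahead(text):
    """Alternative, not used by the wrapper: the lookahead leaves the
    trailing delimiter unconsumed, so adjacent NA fields match in one pass."""
    return re.sub(r"\tNA(?=\t|\n)", "\tnan", text)


# Made-up row with three adjacent NA fields.
row = "sample\tNA\tNA\tNA\t0.5\n"

after_one_pass = replace_na(row, passes=1)
after_two_passes = replace_na(row, passes=2)
single_pass_lookahead = replace_na_lookahead(row)

print(repr(after_one_pass))
print(repr(after_two_passes))
print(repr(single_pass_lookahead))
```

After one pass the middle NA is still present, because the match for the first NA already consumed the tab that precedes it. Each pass replaces every other NA in a run of adjacent NA fields, so two passes are enough for runs of any length.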