MULTIQC
Generate a QC report using MultiQC.
Example
This wrapper can be used in the following way:
rule test_multiqc_dir:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"]),
    output:
        "qc/multiqc.html",
        directory("qc_data/multiqc_data"),
    params:
        extra="--verbose",  # Optional: extra parameters for multiqc.
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"
rule test_multiqc_file:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a"]),
    output:
        "qc/multiqc.a.html",
        "qc_data/multiqc.a_data.zip",
    params:
        extra="--verbose",  # Optional: extra parameters for multiqc.
        use_input_files_only=True,  # Optional, use only a.txt and don't search folder samtools_stats for files
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"
rule test_multiqc_config:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"]),
        config="config/multiqc_config.yaml",
    output:
        "qc/multiqc.config.html",
        "qc_data/multiqc.config_data.zip",
    params:
        extra="--verbose",
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
Options --data-dir, --no-data-dir, --zip-data-dir, --no-report, and --config are automatically inferred.
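The inference can be illustrated with a small standalone sketch. It mirrors the rules in the wrapper code on this page, but the function name `infer_multiqc_flags` and its arguments are hypothetical, and the token check on `extra` is a simplified stand-in for the `is_arg` helper the wrapper imports.

```python
import shlex


def infer_multiqc_flags(outputs, configs=(), extra=""):
    """Return the option string implied by the declared outputs (illustrative only)."""
    flags = shlex.split(extra)
    # One --config per configuration file passed as input.
    for cfg in configs:
        flags += ["--config", str(cfg)]
    # A path ending in "_data" requests the data folder,
    # a ".zip" output requests the zipped data folder.
    for out in outputs:
        if out.endswith("_data"):
            flags.append("--data-dir")
        if out.endswith(".zip"):
            flags.append("--zip-data-dir")
    # Without an HTML output, the report itself is skipped.
    if not any(out.endswith(".html") for out in outputs):
        flags.append("--no-report")
    # If nothing requested data output (including via `extra`), disable it.
    if not {"--data-dir", "-z", "--zip-data-dir"} & set(flags):
        flags.append("--no-data-dir")
    return " ".join(flags)


print(infer_multiqc_flags(["qc/multiqc.html", "qc_data/multiqc_data"], extra="--verbose"))
print(infer_multiqc_flags(["qc/multiqc.a.html"]))
```

Under these assumptions, declaring only an HTML output yields --no-data-dir, while adding a folder ending in _data or a .zip file switches the corresponding data option on.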
Software dependencies
multiqc=1.25.1
snakemake-wrapper-utils=0.6.2
Input/Output
Input:
input directory containing QC files. By default, the folder path is extracted from the provided files, or the parent folder is used if a folder is provided.
Output:
qc report (html)
multiqc data folder or zip (optional)
Params
extra
: additional program arguments.
use_input_files_only
: if set to True, the input is used as is, i.e. no folder is extracted from the provided file names.
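The effect of use_input_files_only can be sketched in isolation. The function below is a hypothetical helper, not part of the wrapper; it follows the same rule as the wrapper code on this page but sorts the result so the output is deterministic, whereas the wrapper keeps an unordered set.

```python
from pathlib import Path


def select_multiqc_inputs(files, configs=(), use_input_files_only=False):
    """Return the paths that would be handed to multiqc (illustrative only)."""
    # Configuration files are passed via --config, not as data inputs.
    data_files = [f for f in files if f not in configs]
    if use_input_files_only:
        # Pass the files exactly as given.
        return sorted(set(data_files))
    # Otherwise let multiqc search the parent folder of every input file.
    return sorted({str(Path(f).parent) for f in data_files})


files = ["samtools_stats/a.txt", "samtools_stats/b.txt"]
print(select_multiqc_inputs(files))
print(select_multiqc_inputs(files[:1], use_input_files_only=True))
```

With the default setting, both files collapse to the single folder samtools_stats, so any other QC files in that folder would also be picked up. Setting the parameter to True restricts the search to the listed files.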
Code
"""Snakemake wrapper for MultiQC"""
__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"
# No need for explicit temp folder, since MultiQC already uses TMPDIR (https://multiqc.info/docs/usage/troubleshooting/#no-space-left-on-device)
from pathlib import Path
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import is_arg
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
# Automatically detect configuration files when provided
# in input. For other ways to provide configuration to
# multiqc, see: https://multiqc.info/docs/getting_started/config/
mqc_config = snakemake.input.get("config", "")
if isinstance(mqc_config, list):
    for fp in mqc_config:
        extra += f" --config {fp}"
elif mqc_config:
    extra += f" --config {mqc_config}"
# Set this to True if multiqc should use the actual input directly
# instead of parsing the folders where the provided files are located
use_input_files_only = snakemake.params.get("use_input_files_only", False)
if not use_input_files_only:
    input_data = set(Path(fp).parent for fp in snakemake.input if fp not in mqc_config)
else:
    input_data = set(fp for fp in snakemake.input if fp not in mqc_config)
# Add extra options depending on output files
no_report = True
for output in snakemake.output:
    if output.endswith(".html"):
        no_report = False
    if output.endswith("_data"):
        extra += " --data-dir"
    if output.endswith(".zip"):
        extra += " --zip-data-dir"
if no_report:
    extra += " --no-report"
if (
    not is_arg("--data-dir", extra)
    and not is_arg("-z", extra)
    and not is_arg("--zip-data-dir", extra)
):
    extra += " --no-data-dir"
# Specify output dir and file name, since they are stored in the JSON file
out_dir = Path(snakemake.output[0]).parent
file_name = Path(snakemake.output[0]).with_suffix("").name
shell(
    "multiqc"
    " {extra}"
    " --outdir {out_dir}"
    " --filename {file_name}"
    " {input_data}"
    " {log}"
)
# Move files to another destination (if needed)
for output in snakemake.output:
    if output.endswith("_data"):
        ext = "_data"
    elif output.endswith(".zip"):
        ext = "_data.zip"
    else:
        ext = Path(output).suffix
    default_dest = f"{out_dir}/{file_name}{ext}"
    if default_dest != output:
        shell("mv {default_dest} {output}")
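The relocation step at the end of the wrapper can be previewed without running MultiQC. The sketch below is an assumption-labelled illustration: `plan_moves` is a hypothetical name, and it only computes which files would be moved, using the same naming rule as the loop above (output directory and base name are taken from the first output).

```python
from pathlib import Path


def plan_moves(outputs):
    """Return (source, destination) pairs for outputs that need relocating."""
    first = Path(outputs[0])
    out_dir = first.parent
    file_name = first.with_suffix("").name
    moves = []
    for out in outputs:
        # Work out the suffix MultiQC appends to the base name.
        if out.endswith("_data"):
            ext = "_data"
        elif out.endswith(".zip"):
            ext = "_data.zip"
        else:
            ext = Path(out).suffix
        default_dest = f"{out_dir}/{file_name}{ext}"
        # Only outputs declared somewhere else need a move.
        if default_dest != out:
            moves.append((default_dest, out))
    return moves


print(plan_moves(["qc/multiqc.html", "qc_data/multiqc_data"]))
print(plan_moves(["qc/multiqc.a.html", "qc_data/multiqc.a_data.zip"]))
```

For the first example rule, only the data folder is moved, from qc/multiqc_data to qc_data/multiqc_data. The HTML report is already written to its declared path.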