MULTIQC

Generate a QC report using MultiQC.

URL: https://multiqc.info/

Example

This wrapper can be used in the following way:

rule test_multiqc_dir:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"]),
    output:
        "qc/multiqc.html",
        directory("qc_data/multiqc_data"),
    params:
        extra="--verbose",  # Optional: extra parameters for multiqc.
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"


rule test_multiqc_file:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a"]),
    output:
        "qc/multiqc.a.html",
        "qc_data/multiqc.a_data.zip",
    params:
        extra="--verbose",  # Optional: extra parameters for multiqc.
        use_input_files_only=True,  # Optional, use only a.txt and don't search folder samtools_stats for files
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"


rule test_multiqc_config:
    input:
        expand("samtools_stats/{sample}.txt", sample=["a", "b"]),
        config="config/multiqc_config.yaml",
    output:
        "qc/multiqc.config.html",
        "qc_data/multiqc.config_data.zip",
    params:
        extra="--verbose",
    log:
        "logs/multiqc.log",
    wrapper:
        "v4.6.0-24-g250dd3e/bio/multiqc"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The options --data-dir, --no-data-dir, --zip-data-dir, --no-report, and --config are inferred automatically from the rule's input and output files.
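The inference of the output-related options can be sketched in plain Python. This is a simplified illustration that follows the same rules as the wrapper code listed in the Code section, not the wrapper itself: the function name `infer_flags` is hypothetical, the flag check is a plain token comparison, and the `--config` handling is left out.

```python
def infer_flags(outputs, extra=""):
    """Return `extra` extended with the flags implied by the output names.

    Hypothetical helper for illustration only.
    """
    flags = extra.split()
    # A path ending in "_data" requests the uncompressed data folder.
    if any(o.endswith("_data") for o in outputs):
        flags.append("--data-dir")
    # A path ending in ".zip" requests the zipped data folder.
    if any(o.endswith(".zip") for o in outputs):
        flags.append("--zip-data-dir")
    # Without an HTML output, the report itself is skipped.
    if not any(o.endswith(".html") for o in outputs):
        flags.append("--no-report")
    # If no data output was requested at all, disable the data folder.
    if not {"--data-dir", "-z", "--zip-data-dir"} & set(flags):
        flags.append("--no-data-dir")
    return " ".join(flags)


print(infer_flags(["qc/multiqc.html", "qc_data/multiqc_data"]))
# --data-dir
print(infer_flags(["qc/multiqc.a.html", "qc_data/multiqc.a_data.zip"], "--verbose"))
# --verbose --zip-data-dir
print(infer_flags(["qc/multiqc.html"]))
# --no-data-dir
```

In practice this means the names chosen in the rule's output section decide which of these options MultiQC receives, so they should not be repeated in extra.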

Software dependencies

  • multiqc=1.25.1

  • snakemake-wrapper-utils=0.6.2

Input/Output

Input:

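  • QC files, or directories containing them. By default, the wrapper passes MultiQC the parent folder of each provided path: for a file, the folder that contains it; for a folder, its parent.

  • config: MultiQC configuration file(s), passed on via --config (optional)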

Output:

  • QC report (HTML)

  • MultiQC data folder (path ending in _data) or zipped data folder (path ending in .zip) (optional)

Params

  • extra: additional program arguments.

  • use_input_files_only: if set to True, the input is used as is, i.e. no folder is extracted from the provided file names.
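The effect of use_input_files_only can be illustrated with a small, self-contained sketch. It assumes the behaviour shown in the Code section (parent folders by default, files as given otherwise); the function name `resolve_inputs` is hypothetical and not part of the wrapper.

```python
from pathlib import Path


def resolve_inputs(files, use_input_files_only=False):
    """Return the set of paths that would be handed to MultiQC.

    Hypothetical helper for illustration only.
    """
    if use_input_files_only:
        # Pass every file exactly as listed in the rule.
        return {str(fp) for fp in files}
    # Default: collapse the files to the folders that contain them.
    return {Path(fp).parent.as_posix() for fp in files}


files = ["samtools_stats/a.txt", "samtools_stats/b.txt"]
print(resolve_inputs(files))
# {'samtools_stats'}
print(sorted(resolve_inputs(files, use_input_files_only=True)))
# ['samtools_stats/a.txt', 'samtools_stats/b.txt']
```

With the default setting, MultiQC scans the whole samtools_stats folder and may pick up files that were not listed as input, which is why the test_multiqc_file rule sets the parameter to True.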

Authors

  • Julian de Ruiter

  • Filipe G. Vieira

  • Thibault Dayris

Code

"""Snakemake wrapper for MultiQC"""

__author__ = "Julian de Ruiter"
__copyright__ = "Copyright 2017, Julian de Ruiter"
__email__ = "julianderuiter@gmail.com"
__license__ = "MIT"


# No need for explicit temp folder, since MultiQC already uses TMPDIR (https://multiqc.info/docs/usage/troubleshooting/#no-space-left-on-device)

from pathlib import Path
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import is_arg


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Automatically detect configuration files when provided
# in input. For other ways to provide configuration to
# multiqc, see: https://multiqc.info/docs/getting_started/config/
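mqc_config = snakemake.input.get("config", "")
# Normalise to a list so that the membership tests further down compare
# whole paths instead of doing a substring search on a single string
if not isinstance(mqc_config, list):
    mqc_config = [mqc_config] if mqc_config else []
for fp in mqc_config:
    extra += f" --config {fp}"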


# Set this to True if multiqc should use the actual input directly
# instead of parsing the folders where the provided files are located
use_input_files_only = snakemake.params.get("use_input_files_only", False)
if not use_input_files_only:
    input_data = set(Path(fp).parent for fp in snakemake.input if fp not in mqc_config)
else:
    input_data = set(fp for fp in snakemake.input if fp not in mqc_config)


# Add extra options depending on output files
no_report = True
for output in snakemake.output:
    if output.endswith(".html"):
        no_report = False
    if output.endswith("_data"):
        extra += " --data-dir"
    if output.endswith(".zip"):
        extra += " --zip-data-dir"
if no_report:
    extra += " --no-report"
if (
    not is_arg("--data-dir", extra)
    and not is_arg("-z", extra)
    and not is_arg("--zip-data-dir", extra)
):
    extra += " --no-data-dir"

# Specify output dir and file name, since they are stored in the JSON file
out_dir = Path(snakemake.output[0]).parent
file_name = Path(snakemake.output[0]).with_suffix("").name


shell(
    "multiqc"
    " {extra}"
    " --outdir {out_dir}"
    " --filename {file_name}"
    " {input_data}"
    " {log}"
)


# Move files to another destination (if needed)
for output in snakemake.output:
    if output.endswith("_data"):
        ext = "_data"
    elif output.endswith(".zip"):
        ext = "_data.zip"
    else:
        ext = Path(output).suffix

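    default_dest = out_dir / f"{file_name}{ext}"
    # Compare as paths, so that "./report.html" and "report.html" count as
    # the same file and no "mv" onto itself is attempted
    if default_dest != Path(output):
        shell("mv {default_dest} {output}")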
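The final step of the listing, relocating outputs, may be easier to follow in isolation. The sketch below reproduces only the naming logic, under the assumption that MultiQC writes everything next to the first declared output; the function name `default_destinations` is hypothetical, and no files are touched.

```python
from pathlib import Path


def default_destinations(outputs):
    """Map each declared output to the path MultiQC is expected to write.

    Hypothetical helper for illustration only.
    """
    first = Path(outputs[0])
    out_dir = first.parent                   # passed as --outdir
    file_name = first.with_suffix("").name   # passed as --filename
    mapping = {}
    for output in outputs:
        if output.endswith("_data"):
            ext = "_data"
        elif output.endswith(".zip"):
            ext = "_data.zip"
        else:
            ext = Path(output).suffix
        mapping[output] = (out_dir / f"{file_name}{ext}").as_posix()
    return mapping


for declared, written in default_destinations(
    ["qc/multiqc.a.html", "qc_data/multiqc.a_data.zip"]
).items():
    action = "kept in place" if declared == written else f"moved from {written}"
    print(f"{declared}: {action}")
# qc/multiqc.a.html: kept in place
# qc_data/multiqc.a_data.zip: moved from qc/multiqc.a_data.zip
```

Because the output directory and file name are both taken from the first output, listing the HTML report first keeps the report where MultiQC writes it and limits moves to the data folder or archive.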