QUAST

Quality Assessment Tool for Genome Assemblies

URL: https://github.com/ablab/quast

Example

This wrapper can be used in the following way:

rule quast:
    input:
        fasta="genome.fasta",
        ref="genome.fasta",
        #gff="annotations.gff",
        #pe1="reads_R1.fastq",
        #pe2="reads_R2.fastq",
        #pe12="reads.fastq",
        #mp1="matereads_R1.fastq",
        #mp2="matereads_R2.fastq",
        #mp12="matereads.fastq",
        #single="single.fastq",
        #pacbio="pacbio.fas",
        #nanopore="nanopore.fastq",
        #ref_bam="ref.bam",
        #ref_sam="ref.sam",
        #bam=["s1.bam","s2.bam"],
        #sam=["s1.sam","s2.sam"],
        #sv_bedpe="sv.bed",
    output:
        report_html="{sample}/report.html",
        report_tex="{sample}/report.tex",
        report_txt="{sample}/report.txt",
        report_pdf="{sample}/report.pdf",
        report_tsv="{sample}/report.tsv",
        treport_tex="{sample}/treport.tex",
        treport_txt="{sample}/treport.txt",
        treport_tsv="{sample}/treport.tsv",
        stats_cum="{sample}/stats/cumulative.pdf",
        stats_gc_plot="{sample}/stats/gc.pdf",
        stats_gc_icarus="{sample}/stats/gc.icarus.txt",
        stats_gc_fasta="{sample}/stats/gc_fasta.pdf",
        stats_ngx="{sample}/stats/NGx.pdf",
        stats_nx="{sample}/stats/Nx.pdf",
        contigs="{sample}/contigs.all_alignments.tsv",
        contigs_mis="{sample}/contigs.mis_contigs.info",
        icarus="{sample}/icarus.html",
        icarus_viewer="{sample}/icarus_viewer.html",
    log:
        "logs/{sample}.log",
    params:
        extra="--min-contig 5 --min-identity 95.0",
    wrapper:
        "v5.0.1/bio/quast"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

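  • The extra param is passed to quast unchanged, so it can carry any additional program arguments (for example --min-contig 5 --min-identity 95.0, as in the rule above).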

Software dependencies

  • quast=5.2.0

Input/Output

Input:

  • Sequences in FASTA format

  • Reference genome (optional)

  • GFF (optional)

  • Paired-end reads (optional)

  • Mate-pair reads (optional)

  • Unpaired reads (optional)

  • PacBio SMRT reads (optional)

  • Oxford Nanopore reads (optional)

  • Mapped reads against the reference in SAM/BAM (optional)

  • Mapped reads against each of the assemblies in SAM/BAM (same order; optional)

  • Structural variants in BEDPE (optional)
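
Each optional input above corresponds to one QUAST option in the wrapper code shown in the Code section. The following is a minimal sketch of that mapping as a table-driven helper. The names QUAST_FLAGS and build_quast_args, and the plain dict standing in for snakemake.input, are illustrative assumptions and not part of the wrapper; unlike the wrapper, the sketch comma-joins any list value, not only bam and sam.

```python
# Input key -> QUAST option, copied from the wrapper code.
QUAST_FLAGS = {
    "ref": "-r",
    "gff": "--features",
    "pe1": "--pe1",
    "pe2": "--pe2",
    "pe12": "--pe12",
    "mp1": "--mp1",
    "mp2": "--mp2",
    "mp12": "--mp12",
    "single": "--single",
    "pacbio": "--pacbio",
    "nanopore": "--nanopore",
    "ref_bam": "--ref-bam",
    "ref_sam": "--ref-sam",
    "bam": "--bam",
    "sam": "--sam",
    "sv_bedpe": "--sv-bedpe",
}


def build_quast_args(inputs):
    """Return the option string for the optional inputs that are set.

    `inputs` is a plain dict here (hypothetical stand-in for snakemake.input).
    """
    args = []
    for key, flag in QUAST_FLAGS.items():
        value = inputs.get(key)
        if not value:
            # Input not provided: emit nothing for this option.
            continue
        if isinstance(value, (list, tuple)):
            # One file per assembly, joined with commas as the wrapper does for bam/sam.
            value = ",".join(value)
        args.append(f"{flag} {value}")
    return " ".join(args)


example = build_quast_args({"ref": "genome.fasta", "bam": ["s1.bam", "s2.bam"]})
print(example)
```

Inputs that are left out simply contribute nothing to the command line, which is why every entry except the FASTA sequences is optional.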

Output:

  • Assessment summary in plain text format

  • Tab-separated version of the summary

  • LaTeX version of the summary

  • Icarus main menu with links to interactive viewers

  • PDF report of all plots combined with all tables

  • HTML version of the report with interactive plots inside

  • Report on misassemblies

  • Report on unaligned and partially unaligned contigs

  • Report on k-mer-based metrics

  • Report on mapped reads statistics
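
The wrapper collects these outputs by copying files out of the QUAST output directory, using the paths listed in the Code section. As a rough sketch, the lookup from output key to source file can be written as one dictionary. The function name quast_sources and its arguments are hypothetical; the relative paths are the ones the wrapper code uses.

```python
from pathlib import Path


def quast_sources(outdir, fasta):
    """Map wrapper output keys to the files expected under the QUAST output dir."""
    out = Path(outdir)
    # Per-assembly files are named after the FASTA basename without its last suffix.
    name = Path(fasta).with_suffix("").name
    sources = {}
    for ext in ("html", "pdf", "tex", "txt", "tsv"):
        sources[f"report_{ext}"] = out / f"report.{ext}"
        sources[f"treport_{ext}"] = out / f"transposed_report.{ext}"
    sources.update(
        {
            "stats_cum": out / "basic_stats" / "cumulative_plot.pdf",
            "stats_gc_plot": out / "basic_stats" / "GC_content_plot.pdf",
            "stats_gc_icarus": out / "basic_stats" / "gc.icarus.txt",
            "stats_gc_fasta": out / "basic_stats" / f"{name}_GC_content_plot.pdf",
            "stats_ngx": out / "basic_stats" / "NGx_plot.pdf",
            "stats_nx": out / "basic_stats" / "Nx_plot.pdf",
            "contigs": out / "contigs_reports" / f"all_alignments_{name}.tsv",
            "contigs_mis": out
            / "contigs_reports"
            / f"contigs_report_{name}.mis_contigs.info",
            "icarus": out / "icarus.html",
            "icarus_viewer": out / "icarus_viewers" / "contig_size_viewer.html",
        }
    )
    return sources


sources = quast_sources("quast_out", "data/genome.fasta")
print(sources["contigs"].as_posix())
```

Only the keys that a rule actually declares under output are copied; the rest are ignored.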

Authors

  • Filipe G. Vieira

Code

__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"


import tempfile
from pathlib import Path
from snakemake.shell import shell


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


# Optional inputs: each one becomes a QUAST option only if it was provided
ref = snakemake.input.get("ref", "")
if ref:
    ref = f"-r {ref}"

gff = snakemake.input.get("gff", "")
if gff:
    gff = f"--features {gff}"

pe1 = snakemake.input.get("pe1", "")
if pe1:
    pe1 = f"--pe1 {pe1}"
pe2 = snakemake.input.get("pe2", "")
if pe2:
    pe2 = f"--pe2 {pe2}"
pe12 = snakemake.input.get("pe12", "")
if pe12:
    pe12 = f"--pe12 {pe12}"
mp1 = snakemake.input.get("mp1", "")
if mp1:
    mp1 = f"--mp1 {mp1}"
mp2 = snakemake.input.get("mp2", "")
if mp2:
    mp2 = f"--mp2 {mp2}"
mp12 = snakemake.input.get("mp12", "")
if mp12:
    mp12 = f"--mp12 {mp12}"
single = snakemake.input.get("single", "")
if single:
    single = f"--single {single}"
pacbio = snakemake.input.get("pacbio", "")
if pacbio:
    pacbio = f"--pacbio {pacbio}"
nanopore = snakemake.input.get("nanopore", "")
if nanopore:
    nanopore = f"--nanopore {nanopore}"
ref_bam = snakemake.input.get("ref_bam", "")
if ref_bam:
    ref_bam = f"--ref-bam {ref_bam}"
ref_sam = snakemake.input.get("ref_sam", "")
if ref_sam:
    ref_sam = f"--ref-sam {ref_sam}"
bam = snakemake.input.get("bam", "")
if bam:
    if isinstance(bam, list):
        bam = ",".join(bam)
    bam = f"--bam {bam}"
sam = snakemake.input.get("sam", "")
if sam:
    if isinstance(sam, list):
        sam = ",".join(sam)
    sam = f"--sam {sam}"
sv_bedpe = snakemake.input.get("sv_bedpe", "")
if sv_bedpe:
    sv_bedpe = f"--sv-bedpe {sv_bedpe}"


with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "quast --threads {snakemake.threads} {ref} {gff} {pe1} {pe2} {pe12} {mp1} {mp2} {mp12} {single} {pacbio} {nanopore} {ref_bam} {ref_sam} {bam} {sam} {sv_bedpe} {extra} -o {tmpdir} {snakemake.input.fasta} {log}"
    )

    # Per-assembly output files are looked up by the FASTA basename without its last suffix
    fasta_name = Path(snakemake.input.fasta).with_suffix("").name

    ### Copy files to final destination
    def save_output(src, dst, wd=Path(".")):
        """Copy src to dst (relative to wd); skip outputs that were not requested."""
        if not dst:
            return
        dest = wd / dst
        shell("cat {src} > {dest}")

    ### Saving OUTPUT files
    # Report files
    for ext in ["html", "pdf", "tex", "txt", "tsv"]:
        save_output(f"{tmpdir}/report." + ext, snakemake.output.get(f"report_{ext}"))
        save_output(
            f"{tmpdir}/transposed_report." + ext, snakemake.output.get(f"treport_{ext}")
        )
    # Stats files
    save_output(
        f"{tmpdir}/basic_stats/cumulative_plot.pdf", snakemake.output.get("stats_cum")
    )
    save_output(
        f"{tmpdir}/basic_stats/GC_content_plot.pdf",
        snakemake.output.get("stats_gc_plot"),
    )
    save_output(
        f"{tmpdir}/basic_stats/gc.icarus.txt", snakemake.output.get("stats_gc_icarus")
    )
    save_output(
        f"{tmpdir}/basic_stats/{fasta_name}_GC_content_plot.pdf",
        snakemake.output.get("stats_gc_fasta"),
    )
    save_output(f"{tmpdir}/basic_stats/NGx_plot.pdf", snakemake.output.get("stats_ngx"))
    save_output(f"{tmpdir}/basic_stats/Nx_plot.pdf", snakemake.output.get("stats_nx"))
    # Contig reports
    save_output(
        f"{tmpdir}/contigs_reports/all_alignments_{fasta_name}.tsv",
        snakemake.output.get("contigs"),
    )
    save_output(
        f"{tmpdir}/contigs_reports/contigs_report_{fasta_name}.mis_contigs.info",
        snakemake.output.get("contigs_mis"),
    )
    # Icarus
    save_output(f"{tmpdir}/icarus.html", snakemake.output.get("icarus"))
    save_output(
        f"{tmpdir}/icarus_viewers/contig_size_viewer.html",
        snakemake.output.get("icarus_viewer"),
    )