QUAST

Quality Assessment Tool for Genome Assemblies

URL: https://github.com/ablab/quast

Example

This wrapper can be used in the following way:

rule quast:
    input:
        fasta="genome.fasta",
        ref="genome.fasta",
        #gff="annotations.gff",
        #pe1="reads_R1.fastq",
        #pe2="reads_R2.fastq",
        #pe12="reads.fastq",
        #mp1="matereads_R1.fastq",
        #mp2="matereads_R2.fastq",
        #mp12="matereads.fastq",
        #single="single.fastq",
        #pacbio="pacbio.fas",
        #nanopore="nanopore.fastq",
        #ref_bam="ref.bam",
        #ref_sam="ref.sam",
        #bam=["s1.bam","s2.bam"],
        #sam=["s1.sam","s2.sam"],
        #sv_bedpe="sv.bed",
    output:
        multiext("{sample}/report.", "html", "tex", "txt", "pdf", "tsv"),
        multiext("{sample}/transposed_report.", "tex", "txt", "tsv"),
        multiext(
            "{sample}/basic_stats/",
            "cumulative_plot.pdf",
            "GC_content_plot.pdf",
            "gc.icarus.txt",
            "genome_GC_content_plot.pdf",
            "NGx_plot.pdf",
            "Nx_plot.pdf",
        ),
        multiext(
            "{sample}/contigs_reports/",
            "all_alignments_genome.tsv",
            "contigs_report_genome.mis_contigs.info",
            "contigs_report_genome.stderr",
            "contigs_report_genome.stdout",
        ),
        "{sample}/contigs_reports/minimap_output/genome.coords_tmp",
        "{sample}/icarus.html",
        "{sample}/icarus_viewers/contig_size_viewer.html",
        "{sample}/quast.log",
    log:
        "logs/{sample}.quast.log",
    params:
        extra="--min-contig 5 --min-identity 95.0",
    wrapper:
        "v1.15.2/bio/quast"

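Note that input, output and log file paths can be chosen freely, as long as all output files share a common parent directory: the wrapper passes that directory to QUAST with -o.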

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The extra param passes additional command-line arguments to QUAST.
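As the wrapper code shows, the extra string is pasted into the QUAST command line unchanged. The following is a minimal sketch of how such a string ends up as separate arguments. The thread count, output directory and assembly file name are illustrative assumptions, not values taken from the wrapper.

```python
import shlex

# The value of params.extra from the example rule.
extra = "--min-contig 5 --min-identity 95.0"

# Hypothetical command line: thread count, output directory and
# assembly name are placeholders chosen for illustration only.
command = f"quast --threads 4 {extra} -o sampleA genome.fasta"

# shlex.split tokenises the string the way a POSIX shell would.
tokens = shlex.split(command)
print(tokens)
```

Because the string is not validated, a misspelled option will only surface when QUAST itself rejects it.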

Software dependencies

  • quast=5.2

Input/Output

Input:

  • Sequences in FASTA format
  • Reference genome (optional)
  • GFF (optional)
  • Paired-end reads (optional)
  • Mate-pair reads (optional)
  • Unpaired reads (optional)
  • PacBio SMRT reads (optional)
  • Oxford Nanopore reads (optional)
  • Mapped reads against the reference in SAM/BAM (optional)
  • Mapped reads against each of the assemblies in SAM/BAM (same order; optional)
  • Structural variants in BEDPE (optional)
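Each optional input above corresponds to one QUAST flag, and inputs that are not supplied are left out. The sketch below illustrates that mapping for a subset of the inputs, assuming they arrive as a plain dict. The helper build_flags and the FLAGS table are hypothetical names introduced for illustration; the flag spellings are the ones used in the wrapper code.

```python
# Hypothetical table: input name -> QUAST flag (subset of the wrapper's inputs).
FLAGS = {
    "ref": "-r",
    "gff": "--features",
    "pe1": "--pe1",
    "pe2": "--pe2",
    "bam": "--bam",
    "sam": "--sam",
    "sv_bedpe": "--sv-bedpe",
}


def build_flags(inputs):
    """Return the flag string for the optional inputs that are present."""
    parts = []
    for name, flag in FLAGS.items():
        value = inputs.get(name, "")
        if not value:
            continue  # optional input not supplied: emit nothing
        if isinstance(value, (list, tuple)):
            # Several alignment files are joined with commas, keeping the
            # order given (it should match the order of the assemblies).
            value = ",".join(value)
        parts.append(f"{flag} {value}")
    return " ".join(parts)


print(build_flags({"ref": "genome.fasta", "bam": ["s1.bam", "s2.bam"]}))
```

Unlike this sketch, the wrapper itself only joins lists for the bam and sam inputs.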

Output:

  • Assessment summary in plain text format
  • Tab-separated version of the summary
  • LaTeX version of the summary
  • Icarus main menu with links to interactive viewers
  • PDF report of all plots combined with all tables
  • HTML version of the report with interactive plots inside
  • Report on misassemblies
  • Report on unaligned and partially unaligned contigs
  • Report on k-mer-based metrics
  • Report on mapped read statistics

Authors

  • Filipe G. Vieira

Code

__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"


import os
from snakemake.shell import shell


extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


ref = snakemake.input.get("ref", "")
if ref:
    ref = f"-r {ref}"

gff = snakemake.input.get("gff", "")
if gff:
    gff = f"--features {gff}"

pe1 = snakemake.input.get("pe1", "")
if pe1:
    pe1 = f"--pe1 {pe1}"
pe2 = snakemake.input.get("pe2", "")
if pe2:
    pe2 = f"--pe2 {pe2}"
pe12 = snakemake.input.get("pe12", "")
if pe12:
    pe12 = f"--pe12 {pe12}"
mp1 = snakemake.input.get("mp1", "")
if mp1:
    mp1 = f"--mp1 {mp1}"
mp2 = snakemake.input.get("mp2", "")
if mp2:
    mp2 = f"--mp2 {mp2}"
mp12 = snakemake.input.get("mp12", "")
if mp12:
    mp12 = f"--mp12 {mp12}"
single = snakemake.input.get("single", "")
if single:
    single = f"--single {single}"
pacbio = snakemake.input.get("pacbio", "")
if pacbio:
    pacbio = f"--pacbio {pacbio}"
nanopore = snakemake.input.get("nanopore", "")
if nanopore:
    nanopore = f"--nanopore {nanopore}"
ref_bam = snakemake.input.get("ref_bam", "")
if ref_bam:
    ref_bam = f"--ref-bam {ref_bam}"
ref_sam = snakemake.input.get("ref_sam", "")
if ref_sam:
    ref_sam = f"--ref-sam {ref_sam}"
bam = snakemake.input.get("bam", "")
if bam:
    if isinstance(bam, list):
        bam = ",".join(bam)
    bam = f"--bam {bam}"
sam = snakemake.input.get("sam", "")
if sam:
    if isinstance(sam, list):
        sam = ",".join(sam)
    sam = f"--sam {sam}"
sv_bedpe = snakemake.input.get("sv_bedpe", "")
if sv_bedpe:
    sv_bedpe = f"--sv-bedpe {sv_bedpe}"


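# QUAST writes all results into a single directory: use the deepest
# directory shared by every declared output file.
output_dir = os.path.commonpath(snakemake.output)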


shell(
    "quast --threads {snakemake.threads} {ref} {gff} {pe1} {pe2} {pe12} {mp1} {mp2} {mp12} {single} {pacbio} {nanopore} {ref_bam} {ref_sam} {bam} {sam} {sv_bedpe} {extra} -o {output_dir} {snakemake.input.fasta} {log}"
)
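The output directory is derived from the declared outputs with os.path.commonpath. A minimal sketch of that behaviour follows, using a few paths from the example rule with the sample wildcard filled in as the hypothetical value sampleA.

```python
import os

# Output paths from the example rule, with {sample} replaced by a
# hypothetical sample name.
outputs = [
    "sampleA/report.html",
    "sampleA/basic_stats/Nx_plot.pdf",
    "sampleA/contigs_reports/minimap_output/genome.coords_tmp",
    "sampleA/quast.log",
]

# The deepest directory shared by all outputs is what gets passed to -o.
output_dir = os.path.commonpath(outputs)
print(output_dir)

# With a single declared output, the common path is the file itself,
# not its parent directory.
single = os.path.commonpath(["sampleA/report.html"])
print(single)
```

This suggests declaring at least two outputs that sit at different levels of the same directory, so that the common path resolves to the intended QUAST output directory.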