QUAST
Quality Assessment Tool for Genome Assemblies
URL: https://github.com/ablab/quast
Example
This wrapper can be used in the following way:
rule quast:
    input:
        fasta="genome.fasta",
        ref="genome.fasta",
        #gff="annotations.gff",
        #pe1="reads_R1.fastq",
        #pe2="reads_R2.fastq",
        #pe12="reads.fastq",
        #mp1="matereads_R1.fastq",
        #mp2="matereads_R2.fastq",
        #mp12="matereads.fastq",
        #single="single.fastq",
        #pacbio="pacbio.fas",
        #nanopore="nanopore.fastq",
        #ref_bam="ref.bam",
        #ref_sam="ref.sam",
        #bam=["s1.bam","s2.bam"],
        #sam=["s1.sam","s2.sam"],
        #sv_bedpe="sv.bed",
    output:
        report_html="{sample}/report.html",
        report_tex="{sample}/report.tex",
        report_txt="{sample}/report.txt",
        report_pdf="{sample}/report.pdf",
        report_tsv="{sample}/report.tsv",
        treport_tex="{sample}/treport.tex",
        treport_txt="{sample}/treport.txt",
        treport_tsv="{sample}/treport.tsv",
        stats_cum="{sample}/stats/cumulative.pdf",
        stats_gc_plot="{sample}/stats/gc.pdf",
        stats_gc_icarus="{sample}/stats/gc.icarus.txt",
        stats_gc_fasta="{sample}/stats/gc_fasta.pdf",
        stats_ngx="{sample}/stats/NGx.pdf",
        stats_nx="{sample}/stats/Nx.pdf",
        contigs="{sample}/contigs.all_alignments.tsv",
        contigs_mis="{sample}/contigs.mis_contigs.info",
        icarus="{sample}/icarus.html",
        icarus_viewer="{sample}/icarus_viewer.html",
    log:
        "logs/{sample}.log",
    params:
        extra="--min-contig 5 --min-identity 95.0",
    wrapper:
        "v5.0.1/bio/quast"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The extra param allows for additional program arguments.
Software dependencies
quast=5.2.0
Input/Output
Input:
Sequences in FASTA format
Reference genome (optional)
GFF (optional)
Paired-end reads (optional)
Mate-pair reads (optional)
Unpaired reads (optional)
PacBio SMRT reads (optional)
Oxford Nanopore reads (optional)
Mapped reads against the reference in SAM/BAM (optional)
Mapped reads against each of the assemblies in SAM/BAM (same order; optional)
Structural variants in BEDPE (optional)
Output:
Assessment summary in plain text format
Tab-separated version of the summary
LaTeX version of the summary
Icarus main menu with links to interactive viewers
PDF report of all plots combined with all tables
HTML version of the report with interactive plots inside
Report on misassemblies
Report on unaligned and partially unaligned contigs
Report on k-mer-based metrics
Report on mapped reads statistics
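The optional inputs listed under Input above each correspond to one QUAST command-line flag. As a rough sketch, that correspondence can be written as a lookup table. The flag names are taken from the wrapper code in this document; the helper `build_quast_args` and the plain dict standing in for `snakemake.input` are hypothetical and only serve the illustration. Unlike the wrapper, which joins lists only for `bam` and `sam`, this sketch joins any list value.

```python
# Input name -> QUAST flag, as used by the wrapper code in this document.
FLAGS = {
    "ref": "-r",
    "gff": "--features",
    "pe1": "--pe1",
    "pe2": "--pe2",
    "pe12": "--pe12",
    "mp1": "--mp1",
    "mp2": "--mp2",
    "mp12": "--mp12",
    "single": "--single",
    "pacbio": "--pacbio",
    "nanopore": "--nanopore",
    "ref_bam": "--ref-bam",
    "ref_sam": "--ref-sam",
    "bam": "--bam",
    "sam": "--sam",
    "sv_bedpe": "--sv-bedpe",
}


def build_quast_args(inputs):
    """Hypothetical helper: turn a dict of named inputs into CLI arguments."""
    args = []
    for name, flag in FLAGS.items():
        value = inputs.get(name)
        if not value:
            # Input not provided: emit nothing for this flag.
            continue
        if isinstance(value, (list, tuple)):
            # Several files for one flag are passed comma-separated.
            value = ",".join(value)
        args += [flag, str(value)]
    return args


example = build_quast_args({"ref": "genome.fasta", "bam": ["s1.bam", "s2.bam"]})
print(" ".join(example))
```

Inputs that are not supplied contribute nothing to the command line, which is why every entry except the FASTA sequences can be left commented out in the example rule.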
Code
__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"
import tempfile
from pathlib import Path
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

ref = snakemake.input.get("ref", "")
if ref:
    ref = f"-r {ref}"

gff = snakemake.input.get("gff", "")
if gff:
    gff = f"--features {gff}"

pe1 = snakemake.input.get("pe1", "")
if pe1:
    pe1 = f"--pe1 {pe1}"

pe2 = snakemake.input.get("pe2", "")
if pe2:
    pe2 = f"--pe2 {pe2}"

pe12 = snakemake.input.get("pe12", "")
if pe12:
    pe12 = f"--pe12 {pe12}"

mp1 = snakemake.input.get("mp1", "")
if mp1:
    mp1 = f"--mp1 {mp1}"

mp2 = snakemake.input.get("mp2", "")
if mp2:
    mp2 = f"--mp2 {mp2}"

mp12 = snakemake.input.get("mp12", "")
if mp12:
    mp12 = f"--mp12 {mp12}"

single = snakemake.input.get("single", "")
if single:
    single = f"--single {single}"

pacbio = snakemake.input.get("pacbio", "")
if pacbio:
    pacbio = f"--pacbio {pacbio}"

nanopore = snakemake.input.get("nanopore", "")
if nanopore:
    nanopore = f"--nanopore {nanopore}"

ref_bam = snakemake.input.get("ref_bam", "")
if ref_bam:
    ref_bam = f"--ref-bam {ref_bam}"

ref_sam = snakemake.input.get("ref_sam", "")
if ref_sam:
    ref_sam = f"--ref-sam {ref_sam}"

bam = snakemake.input.get("bam", "")
if bam:
    if isinstance(bam, list):
        bam = ",".join(bam)
    bam = f"--bam {bam}"

sam = snakemake.input.get("sam", "")
if sam:
    if isinstance(sam, list):
        sam = ",".join(sam)
    sam = f"--sam {sam}"

sv_bedpe = snakemake.input.get("sv_bedpe", "")
if sv_bedpe:
    sv_bedpe = f"--sv-bedpe {sv_bedpe}"

with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "quast --threads {snakemake.threads} {ref} {gff} {pe1} {pe2} {pe12} {mp1} {mp2} {mp12} {single} {pacbio} {nanopore} {ref_bam} {ref_sam} {bam} {sam} {sv_bedpe} {extra} -o {tmpdir} {snakemake.input.fasta} {log}"
    )

    fasta_name = Path(snakemake.input.fasta).with_suffix("").name

    ### Copy files to final destination
    def save_output(src, dst, wd=Path(".")):
        if not dst:
            return 0
        dest = wd / dst
        shell("cat {src} > {dest}")

    ### Saving OUTPUT files
    # Report files
    for ext in ["html", "pdf", "tex", "txt", "tsv"]:
        save_output(f"{tmpdir}/report." + ext, snakemake.output.get(f"report_{ext}"))
        save_output(
            f"{tmpdir}/transposed_report." + ext, snakemake.output.get(f"treport_{ext}")
        )

    # Stats files
    save_output(
        f"{tmpdir}/basic_stats/cumulative_plot.pdf", snakemake.output.get("stats_cum")
    )
    save_output(
        f"{tmpdir}/basic_stats/GC_content_plot.pdf",
        snakemake.output.get("stats_gc_plot"),
    )
    save_output(
        f"{tmpdir}/basic_stats/gc.icarus.txt", snakemake.output.get("stats_gc_icarus")
    )
    save_output(
        f"{tmpdir}/basic_stats/{fasta_name}_GC_content_plot.pdf",
        snakemake.output.get("stats_gc_fasta"),
    )
    save_output(f"{tmpdir}/basic_stats/NGx_plot.pdf", snakemake.output.get("stats_ngx"))
    save_output(f"{tmpdir}/basic_stats/Nx_plot.pdf", snakemake.output.get("stats_nx"))

    # Contig reports
    save_output(
        f"{tmpdir}/contigs_reports/all_alignments_{fasta_name}.tsv",
        snakemake.output.get("contigs"),
    )
    save_output(
        f"{tmpdir}/contigs_reports/contigs_report_{fasta_name}.mis_contigs.info",
        snakemake.output.get("contigs_mis"),
    )

    # Icarus
    save_output(f"{tmpdir}/icarus.html", snakemake.output.get("icarus"))
    save_output(
        f"{tmpdir}/icarus_viewers/contig_size_viewer.html",
        snakemake.output.get("icarus_viewer"),
    )
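
Two details of the code above are easy to miss: `fasta_name` is built by dropping only the last suffix of the input path, and `save_output` silently skips any output the rule did not declare. The following is a minimal standalone sketch of both behaviours, not the wrapper itself. It assumes made-up file names, replaces `snakemake.output` with a plain dict, and swaps the `cat {src} > {dest}` shell call for `shutil.copyfile`; the helper name `fasta_stem` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def fasta_stem(path):
    """Same expression as the wrapper: drop the last suffix, keep the base name."""
    return Path(path).with_suffix("").name


def save_output(src, dst):
    """Copy src to dst only if that output was declared (dst is not None)."""
    if not dst:
        return False
    dst = Path(dst)
    # Snakemake creates output directories itself; this sketch must do it explicitly.
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return True


with tempfile.TemporaryDirectory() as tmpdir, tempfile.TemporaryDirectory() as outdir:
    # Stand-in for a QUAST run: create one report file in the temporary directory.
    (Path(tmpdir) / "report.txt").write_text("fake report\n")

    # Stand-in for snakemake.output: only report_txt was declared by the rule.
    declared = {"report_txt": f"{outdir}/sample1/report.txt"}

    copied_txt = save_output(f"{tmpdir}/report.txt", declared.get("report_txt"))
    copied_pdf = save_output(f"{tmpdir}/report.pdf", declared.get("report_pdf"))
    result_text = Path(declared["report_txt"]).read_text()

print(fasta_stem("genome.fasta"))
print(fasta_stem("genome.fasta.gz"))
print(copied_txt, copied_pdf)
```

Because only the last suffix is removed, a path such as `genome.fasta.gz` gives `genome.fasta` rather than `genome`. Whether that matches the per-assembly file names QUAST writes for compressed inputs is not verified here, so the `stats_gc_fasta`, `contigs` and `contigs_mis` outputs are worth checking in that case.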