HIFIASM
A haplotype-resolved assembler for accurate HiFi reads
URL: https://github.com/chhylp123/hifiasm
Example
This wrapper can be used in the following way:
rule hifiasm:
    input:
        fasta=[
            "reads/HiFi_dataset_01.fasta.gz",
            "reads/HiFi_dataset_02.fasta.gz",
        ],
        # optional
        # hic1="reads/Hi-C_dataset_R1.fastq.gz",
        # hic2="reads/Hi-C_dataset_R2.fastq.gz",
    output:
        multiext(
            "hifiasm/{sample}.",
            "a_ctg.gfa",
            "a_ctg.lowQ.bed",
            "a_ctg.noseq.gfa",
            "p_ctg.gfa",
            "p_ctg.lowQ.bed",
            "p_ctg.noseq.gfa",
            "p_utg.gfa",
            "p_utg.lowQ.bed",
            "p_utg.noseq.gfa",
            "r_utg.gfa",
            "r_utg.lowQ.bed",
            "r_utg.noseq.gfa",
        ),
    log:
        "logs/hifiasm/{sample}.log",
    params:
        extra="--primary -f 37 -l 1 -s 0.75 -O 1",
    threads: 2
    resources:
        mem_mb=1024,
    wrapper:
        "v5.8.0-3-g915ba34/bio/hifiasm"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The extra parameter passes additional command-line arguments through to hifiasm.
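To make the role of extra concrete, here is a minimal Python sketch of how such a string could be spliced into a hifiasm argument list. The helper build_command and the sampleA prefix are hypothetical names used only for illustration; the wrapper itself interpolates the string into a shell command rather than building a list.

```python
import shlex

def build_command(threads, extra, out_prefix, reads):
    """Hypothetical helper: assemble a hifiasm argument list.

    `extra` is split the way a POSIX shell would split it, so every
    option it contains ends up as a separate argument.
    """
    args = ["hifiasm", "-t", str(threads)]
    args += shlex.split(extra)      # user-supplied options, passed through unchanged
    args += ["-o", out_prefix]      # output prefix
    args += list(reads)             # HiFi read files come last
    return args

cmd = build_command(
    threads=2,
    extra="--primary -f 37 -l 1 -s 0.75 -O 1",
    out_prefix="hifiasm/sampleA",   # assumed sample name
    reads=[
        "reads/HiFi_dataset_01.fasta.gz",
        "reads/HiFi_dataset_02.fasta.gz",
    ],
)
print(" ".join(cmd))
```

Because the options are forwarded unchanged, any flag that the installed hifiasm version accepts can be supplied this way.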
Software dependencies
hifiasm=0.24.0
Input/Output
Input:
PacBio HiFi reads (fasta)
Hi-C reads (fastq; optional)
Output:
assembly graphs (GFA)
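The optional Hi-C inputs are translated into hifiasm's --h1 and --h2 options, with multiple files joined by commas. The sketch below shows that translation in isolation; hic_option is a hypothetical helper name, and the file names are placeholders rather than files shipped with the wrapper.

```python
def hic_option(flag, value):
    """Hypothetical helper: format an optional Hi-C input as a command-line option.

    Returns an empty string when no input was given, so the option can be
    dropped from the command without special handling.
    """
    if not value:
        return ""
    if isinstance(value, (list, tuple)):
        value = ",".join(value)     # several files become one comma-separated value
    return f"{flag} {value}"

# No Hi-C reads supplied: nothing is added to the command.
print(repr(hic_option("--h1", "")))

# One file per mate.
print(hic_option("--h1", "reads/Hi-C_dataset_R1.fastq.gz"))

# Several files for one mate (placeholder names).
print(hic_option("--h2", ["lane1_R2.fastq.gz", "lane2_R2.fastq.gz"]))
```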
Code
__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"

import os
from snakemake.shell import shell

log = snakemake.log_fmt_shell()
extra = snakemake.params.get("extra", "")

# Optional Hi-C reads (first mate); multiple files are comma-separated
hic1 = snakemake.input.get("hic1", "")
if hic1:
    if isinstance(hic1, list):
        hic1 = ",".join(hic1)
    hic1 = "--h1 {}".format(hic1)

# Optional Hi-C reads (second mate)
hic2 = snakemake.input.get("hic2", "")
if hic2:
    if isinstance(hic2, list):
        hic2 = ",".join(hic2)
    hic2 = "--h2 {}".format(hic2)

# Output prefix: the leading characters shared by all output paths
out_prefix = os.path.commonprefix(snakemake.output).rstrip(".")

shell(
    "hifiasm"
    " -t {snakemake.threads}"
    " {extra}"
    " {hic1} {hic2}"
    " -o {out_prefix}"
    " {snakemake.input.fasta}"
    " {log}"
)
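The output prefix handed to hifiasm via -o is derived from the declared output paths with os.path.commonprefix. The sketch below reproduces that one step on hypothetical paths (the sampleA names are assumptions) to show what the prefix looks like and where the approach can surprise: commonprefix compares strings character by character, so a reduced set of outputs that all share extra leading characters yields a longer prefix.

```python
import os

def output_prefix(paths):
    """Mirror the wrapper's prefix step: shared leading characters, minus trailing dots."""
    return os.path.commonprefix(list(paths)).rstrip(".")

# Hypothetical output paths covering several graph types.
all_graphs = [
    "hifiasm/sampleA.a_ctg.gfa",
    "hifiasm/sampleA.p_ctg.gfa",
    "hifiasm/sampleA.p_utg.gfa",
    "hifiasm/sampleA.r_utg.gfa",
]
print(output_prefix(all_graphs))    # hifiasm/sampleA

# Only "p_" outputs: the shared "p_" becomes part of the prefix.
only_primary = [
    "hifiasm/sampleA.p_ctg.gfa",
    "hifiasm/sampleA.p_utg.gfa",
]
print(output_prefix(only_primary))  # hifiasm/sampleA.p_
```

Under these assumptions, declaring outputs whose suffixes start with different characters (as the example rule does) keeps the derived prefix at the intended "hifiasm/{sample}" form.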