RAGTAG-PATCH
Homology-based assembly patching.
URL: https://github.com/malonge/RagTag/wiki/patch
Example
This wrapper can be used in the following way:
rule patch:
    input:
        query="fasta/{query}.fasta",
        ref="fasta/{reference}.fasta",
    output:
        agp="{query}_{reference}.agp",
        fasta="{query}_{reference}.fasta",
        rename_agp="{query}_{reference}.rename.agp",
        rename_fasta="{query}_{reference}.rename.fasta",
        ctg_agp="{query}_{reference}.ctg.agp",
        ctg_fasta="{query}_{reference}.ctg.fasta",
        comps_fasta="{query}_{reference}.comps.fasta",
        asm_dir=directory("{query}_{reference}_asm"),  # Assembly alignment files
    params:
        extra="",
    threads: 16
    log:
        "logs/ragtag/{query}_patch_{reference}.log",
    wrapper:
        "v3.5.2/bio/ragtag/patch"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
Multiple threads can be used during minimap2/unimap alignment.
Software dependencies
ragtag=2.1.0
Input/Output
Input:
ref
: reference FASTA file (uncompressed or bgzipped)
query
: query FASTA file (uncompressed or bgzipped)
Output:
fasta
: The final FASTA file containing the patched assembly.
agp
: The final AGP file defining how ragtag.patch.fasta is built.
rename_agp
: Optional. An AGP file defining the new names for query sequences.
rename_fasta
: Optional. A FASTA file with the original query sequence, but with new names.
comps_fasta
: Optional. The split target assembly and the renamed query assembly combined into one FASTA file.
ctg_agp
: Optional. An AGP file defining how the target assembly was split at gaps.
ctg_fasta
: Optional. The target assembly split at gaps.
asm_dir
: Optional. A directory containing assembly alignment files.
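Each file-type output key corresponds to a file that RagTag writes with a `ragtag.patch.` prefix. The following is a minimal sketch of that mapping, derived from the wrapper code in the Code section; the helper name `ragtag_filename` and the `FILE_KEYS` list are hypothetical and exist only for illustration. The `asm_dir` key is left out because it collects every `ragtag.patch.asm.*` file rather than a single file.

```python
# Hypothetical helper, for illustration only: mirrors how the wrapper derives
# the name of the RagTag result file for each named output key.
FILE_KEYS = [
    "agp",
    "fasta",
    "rename_agp",
    "rename_fasta",
    "comps_fasta",
    "ctg_agp",
    "ctg_fasta",
]


def ragtag_filename(key: str) -> str:
    """Return the file name expected in RagTag's output directory for `key`."""
    if key not in FILE_KEYS:
        raise ValueError(f"Invalid key: {key!r}. Valid keys are: {FILE_KEYS!r}")
    # Underscores in the key become dots in the file extension.
    return "ragtag.patch." + key.replace("_", ".")


for key in FILE_KEYS:
    print(f"{key} -> {ragtag_filename(key)}")
```

Because only the requested keys are moved out of the temporary working directory, any output that is not named in the rule is discarded when the job finishes.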
Params
extra
: additional parameters
Code
"""Snakemake wrapper for ragtag-patch."""
__author__ = "Curro Campuzano Jiménez"
__copyright__ = "Copyright 2023, Curro Campuzano Jiménez"
__email__ = "campuzanocurro@gmail.com"
__license__ = "MIT"

import tempfile
from snakemake.shell import shell

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

assert snakemake.output.keys(), "Output must contain at least one named file."

valid_keys = [
    "agp",
    "fasta",
    "rename_agp",
    "rename_fasta",
    "comps_fasta",
    "ctg_agp",
    "ctg_fasta",
    "asm_dir",
]

for key in snakemake.output.keys():
    assert (
        key in valid_keys
    ), "Invalid key in output. Valid keys are: %r. Given: %r." % (valid_keys, key)

with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "ragtag.py patch"
        " {snakemake.input.ref}"
        " {snakemake.input.query}"
        " {extra}"
        " -o {tmpdir} -t {snakemake.threads}"
        " {log}"
    )

    for key in valid_keys[:-1]:
        outfile = snakemake.output.get(key)
        if outfile:
            extension = key.replace("_", ".")
            shell("mv {tmpdir}/ragtag.patch.{extension} {outfile}")

    outdir = snakemake.output.get("asm_dir")
    if outdir:
        # Move files into directory outdir
        shell("mkdir -p {outdir} && mv {tmpdir}/ragtag.patch.asm.* {outdir}")
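The collection step at the end of the wrapper can also be tried outside Snakemake. The sketch below is not the wrapper's implementation: it swaps the `mv` shell calls for `shutil.move` and `glob`, and the function name `collect_outputs` and the placeholder file names in the demo are assumptions made for illustration. Unlike the wrapper, it creates parent directories itself, since Snakemake is not there to do so.

```python
import glob
import os
import shutil
import tempfile


def collect_outputs(workdir, outputs):
    """Move RagTag result files from `workdir` to the paths in `outputs`.

    `outputs` maps output keys (e.g. "fasta", "rename_agp", "asm_dir")
    to destination paths. Returns the list of destinations written.
    """
    moved = []
    for key, dest in outputs.items():
        if key == "asm_dir":
            # Gather every alignment file into the requested directory.
            os.makedirs(dest, exist_ok=True)
            pattern = os.path.join(workdir, "ragtag.patch.asm.*")
            for path in glob.glob(pattern):
                shutil.move(path, os.path.join(dest, os.path.basename(path)))
        else:
            # Same key-to-extension rule as the wrapper: "_" becomes ".".
            src = os.path.join(workdir, "ragtag.patch." + key.replace("_", "."))
            parent = os.path.dirname(dest)
            if parent:
                os.makedirs(parent, exist_ok=True)
            shutil.move(src, dest)
        moved.append(dest)
    return moved


# Demo with empty placeholder files standing in for real RagTag results.
with tempfile.TemporaryDirectory() as workdir, tempfile.TemporaryDirectory() as root:
    for name in ("ragtag.patch.fasta", "ragtag.patch.agp", "ragtag.patch.asm.example"):
        open(os.path.join(workdir, name), "w").close()
    wanted = {
        "fasta": os.path.join(root, "patched.fasta"),
        "asm_dir": os.path.join(root, "asm"),
    }
    collect_outputs(workdir, wanted)
    print(sorted(os.listdir(root)))
    print(os.listdir(wanted["asm_dir"]))
```

In the demo, `ragtag.patch.agp` is never requested, so it stays in the working directory and is deleted with it, which matches how the wrapper treats outputs that are not named in the rule.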