RAGTAG-MERGE
Scaffold merging.
URL: https://github.com/malonge/RagTag/wiki/merge
Example
This wrapper can be used in the following way:
rule merge:
input:
fasta="input/{assembly}.fasta",
agps=expand("input/{scaffold}.agp", scaffold=["scf1", "scf2"]),
#bam = "input/Hi-C.bam",
output:
fasta="{assembly}_merged.fasta",
agp="{assembly}_merged.agp",
#links = "{assembly}_merged.links",
params:
extra="",
log:
"logs/ragtag/{assembly}_merged.log",
wrapper:
"v3.9.0/bio/ragtag/merge"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
ragtag=2.1.0
Input/Output
Input:
ref
: assembly fasta file (uncompressed or bgzipped).agps
: scaffolding AGP files.bam
: Optional. Hi-C alignments in BAM format.
Output:
fasta
: The merged scaffolds in FASTA format.agp
: The merged scaffold results in AGP format.links
: Optional. If Hi-C alignments in BAM format were given.
Params
extra
: additional parameters. Do not use with ‘-b’, add the bam file to the input instead.
Code
"""Snakemake wrapper for ragtag-merge."""
__author__ = "Curro Campuzano Jiménez"
__copyright__ = "Copyright 2023, Curro Campuzano Jiménez"
__email__ = "campuzanocurro@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
import tempfile
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")
fasta_file = snakemake.input.get("fasta")
# Check fasta_file is no
assert fasta_file, "Input must contain only one fasta file."
agp_files = snakemake.input.get("agps")
assert len(agp_files) >= 2, "Input must contain at least 2 agp files. Given: %r." % len(
agp_files
)
bam_file = snakemake.input.get("bam")
# Add Hi-C BAM file to params if present
if bam_file:
extra += f" -b {bam_file}"
# Raise warning if links file is expected but no Hi-C BAM file is given
if snakemake.output.get("links") and not bam_file:
raise "Links file is present but no Hi-C BAM file is given."
# Check that all keys in snakemake output are valid are either agp, fasta or links
assert snakemake.output.keys(), "Output must contain at least one named file."
valid_keys = ["agp", "fasta", "links"]
for key in snakemake.output.keys():
assert (
key in valid_keys
), "Invalid key in output. Valid keys are: %r. Given: %r." % (valid_keys, key)
with tempfile.TemporaryDirectory() as tmpdir:
shell(
"ragtag.py merge"
" {fasta_file}"
" {agp_files}"
" {extra}"
" -o {tmpdir}"
" {log}"
)
for key in valid_keys:
outfile = snakemake.output.get(key)
if outfile:
shell("mv {tmpdir}/ragtag.merge.{key} {outfile}")