RAGTAG-MERGE


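Scaffold merging: combine two or more scaffoldings (AGP files) of the same assembly into a single set of merged scaffolds with ragtag.py merge.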

URL: https://github.com/malonge/RagTag/wiki/merge

Example

This wrapper can be used in the following way:

rule merge:
    input:
        fasta="input/{assembly}.fasta",
        agps=expand("input/{scaffold}.agp", scaffold=["scf1", "scf2"]),
        #bam = "input/Hi-C.bam",
    output:
        fasta="{assembly}_merged.fasta",
        agp="{assembly}_merged.agp",
        #links = "{assembly}_merged.links",
    params:
        extra="",
    log:
        "logs/ragtag/{assembly}_merged.log",
    wrapper:
        "v3.9.0/bio/ragtag/merge"

Note that input, output and log file paths can be chosen freely.
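
The two commented-out lines in the rule above correspond to the optional Hi-C mode. A sketch of the same rule with them enabled follows. It assumes a Hi-C alignment file exists at input/Hi-C.bam (the path taken from the comment); the rule name merge_hic and the log file name are made up for illustration.

```python
rule merge_hic:  # hypothetical rule name
    input:
        fasta="input/{assembly}.fasta",
        agps=expand("input/{scaffold}.agp", scaffold=["scf1", "scf2"]),
        bam="input/Hi-C.bam",  # optional Hi-C alignments; the wrapper passes this as -b
    output:
        fasta="{assembly}_merged.fasta",
        agp="{assembly}_merged.agp",
        links="{assembly}_merged.links",  # only accepted together with the bam input
    params:
        extra="",  # do not put -b here
    log:
        "logs/ragtag/{assembly}_merged_hic.log",  # hypothetical log path
    wrapper:
        "v3.9.0/bio/ragtag/merge"
```

Use this variant instead of the rule above rather than alongside it, since both would produce the same output files.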

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • ragtag=2.1.0

Input/Output

Input:

  • fasta: assembly FASTA file (uncompressed or bgzipped).

  • agps: two or more scaffolding AGP files.

  • bam: Optional. Hi-C alignments in BAM format.

Output:

  • fasta: The merged scaffolds in FASTA format.

  • agp: The merged scaffold results in AGP format.

  • links: Optional. Links file; it can only be requested when Hi-C alignments in BAM format are given as input.
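
The constraints on these keys, as the wrapper code enforces them (a FASTA file is required, at least two AGP files, only agp, fasta and links as output keys, and links only together with bam), can be sketched as a standalone check. This is a minimal illustration, not the wrapper itself: validate_merge_io is a hypothetical helper, and plain dicts stand in for Snakemake's named input and output objects.

```python
def validate_merge_io(inputs, outputs):
    """Check input/output keys the way the wrapper does (illustrative sketch).

    `inputs` and `outputs` are plain dicts here; in a real run Snakemake
    supplies its own named-list objects instead.
    """
    if not inputs.get("fasta"):
        raise ValueError("Input must contain a fasta file.")

    agps = inputs.get("agps") or []
    if isinstance(agps, str):  # a single path is a string, not a list
        agps = [agps]
    if len(agps) < 2:
        raise ValueError(
            f"Input must contain at least 2 agp files. Given: {len(agps)}."
        )

    if not outputs:
        raise ValueError("Output must contain at least one named file.")
    valid_keys = {"agp", "fasta", "links"}
    unknown = sorted(set(outputs) - valid_keys)
    if unknown:
        raise ValueError(f"Invalid output keys: {unknown}.")

    # The links file can only be produced when Hi-C alignments are supplied
    if outputs.get("links") and not inputs.get("bam"):
        raise ValueError("Links file is requested but no Hi-C BAM file is given.")
    return True
```

Requesting links without a bam input, for example, raises ValueError before any command is run.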

Params

  • extra: additional parameters passed to ragtag.py merge. Do not pass '-b' here; add the BAM file to the input instead, and the wrapper appends '-b' itself.
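
The interplay between extra and the bam input can be sketched as follows. The appending step mirrors what the wrapper code does; the guard that rejects '-b' inside extra is an assumption added for illustration, since the wrapper itself performs no such check. The name build_extra is hypothetical.

```python
import shlex


def build_extra(extra="", bam_file=None):
    """Return the option string with -b appended when a BAM file is given."""
    # Hypothetical guard (not in the wrapper): refuse -b in extra so the
    # BAM file is always declared as a tracked input instead.
    if "-b" in shlex.split(extra):
        raise ValueError(
            "Do not pass -b in extra; add the BAM file to the input instead."
        )
    # Same step as in the wrapper: append the Hi-C BAM file if present
    if bam_file:
        extra += f" -b {bam_file}"
    return extra
```

Declaring the BAM file as an input rather than hiding it in extra lets Snakemake track it as a dependency of the rule.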

Authors

  • Curro Campuzano Jiménez

Code

"""Snakemake wrapper for ragtag-merge."""

__author__ = "Curro Campuzano Jiménez"
__copyright__ = "Copyright 2023, Curro Campuzano Jiménez"
__email__ = "campuzanocurro@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell
import tempfile


log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

fasta_file = snakemake.input.get("fasta")
# Check that an assembly fasta file is given
assert fasta_file, "Input must contain a fasta file (key: fasta)."

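agp_files = snakemake.input.get("agps", [])
# A single path arrives as a plain string; wrap it so len() counts files
if isinstance(agp_files, str):
    agp_files = [agp_files]

assert len(agp_files) >= 2, "Input must contain at least 2 agp files. Given: %r." % len(
    agp_files
)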

bam_file = snakemake.input.get("bam")

# Add Hi-C BAM file to params if present
if bam_file:
    extra += f" -b {bam_file}"

# Raise an error if a links file is expected but no Hi-C BAM file is given
if snakemake.output.get("links") and not bam_file:
    raise ValueError("Links file is requested but no Hi-C BAM file is given.")

# Check that all keys in snakemake output are valid: agp, fasta or links
assert snakemake.output.keys(), "Output must contain at least one named file."
valid_keys = ["agp", "fasta", "links"]
for key in snakemake.output.keys():
    assert (
        key in valid_keys
    ), "Invalid key in output. Valid keys are: %r. Given: %r." % (valid_keys, key)

with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "ragtag.py merge"
        " {fasta_file}"
        " {agp_files}"
        " {extra}"
        " -o {tmpdir}"
        " {log}"
    )
    for key in valid_keys:
        outfile = snakemake.output.get(key)
        if outfile:
            shell("mv {tmpdir}/ragtag.merge.{key} {outfile}")
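
To make the final step of the wrapper easier to follow, the sketch below reproduces, outside Snakemake, the command line that the shell call expands to and the files that are then moved out of the temporary directory. The function names build_merge_command and produced_files are hypothetical, the log redirection is left out, and the paths in the usage lines are examples only.

```python
import shlex


def build_merge_command(fasta, agps, outdir, extra=""):
    """Assemble the ragtag.py merge call in the same order as the wrapper."""
    argv = ["ragtag.py", "merge", fasta, *agps, *shlex.split(extra), "-o", outdir]
    return shlex.join(argv)


def produced_files(outdir, output):
    """Map each file written to outdir to the destination requested in output.

    The wrapper expects the tool to write ragtag.merge.<key> for the keys
    agp, fasta and links, and moves only the ones that were requested.
    """
    moves = {}
    for key in ("agp", "fasta", "links"):
        dest = output.get(key)
        if dest:
            moves[f"{outdir}/ragtag.merge.{key}"] = dest
    return moves


cmd = build_merge_command(
    "input/asm.fasta", ["input/scf1.agp", "input/scf2.agp"], "/tmp/out"
)
print(cmd)
print(produced_files("/tmp/out", {"fasta": "asm_merged.fasta", "agp": "asm_merged.agp"}))
```

Running the tool in a temporary directory and moving only the requested files is what allows the output paths in the rule to be chosen freely.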