RAGTAG-PATCH

Homology-based assembly patching.

URL: https://github.com/malonge/RagTag/wiki/patch

Example

This wrapper can be used in the following way:

rule patch:
    input:
        query="fasta/{query}.fasta",
        ref="fasta/{reference}.fasta",
    output:
        agp="{query}_{reference}.agp",
        fasta="{query}_{reference}.fasta",
        rename_agp="{query}_{reference}.rename.agp",
        rename_fasta="{query}_{reference}.rename.fasta",
        ctg_agp="{query}_{reference}.ctg.agp",
        ctg_fasta="{query}_{reference}.ctg.fasta",
        comps_fasta="{query}_{reference}.comps.fasta",
        asm_dir=directory("{query}_{reference}_asm"),  # Assembly alignment files
    params:
        extra="",
    threads: 16
    log:
        "logs/ragtag/{query}_patch_{reference}.log",
    wrapper:
        "v3.8.0-1-g149ef14/bio/ragtag/patch"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

The rule's threads value is passed to ragtag.py patch via -t; multiple threads are used during minimap2/unimap alignment.

Software dependencies

  • ragtag=2.1.0

Input/Output

Input:

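  • ref: FASTA file of the target assembly, i.e. the assembly to be patched (uncompressed or bgzipped); passed to ragtag.py patch as the first positional argument

  • query: FASTA file of the query assembly used to patch the target (uncompressed or bgzipped)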

Output:

  • fasta: The final FASTA file containing the patched assembly

  • agp: The final AGP file defining how ragtag.patch.fasta is built.

  • rename_agp: Optional. An AGP file defining the new names for query sequences

  • rename_fasta: Optional. A FASTA file with the original query sequences, but with new names.

  • comps_fasta: Optional. The split target assembly and the renamed query assembly combined into one FASTA file.

  • ctg_agp: Optional. An AGP file defining how the target assembly was split at gaps

  • ctg_fasta: Optional. The target assembly split at gaps

  • asm_dir: Optional. A directory containing the assembly alignment files (ragtag.patch.asm.*).
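
The file-type output keys map onto RagTag's own file names in a regular way, which the wrapper code relies on: every underscore in the key becomes a dot, and the result is appended to the ragtag.patch. prefix. The following is a minimal Python sketch of that mapping. The helper name ragtag_patch_filename is hypothetical and is not part of the wrapper.

```python
# Hypothetical helper (not part of the wrapper) showing how an output key
# is turned into the file name RagTag writes inside its output directory.
FILE_KEYS = [
    "agp",
    "fasta",
    "rename_agp",
    "rename_fasta",
    "comps_fasta",
    "ctg_agp",
    "ctg_fasta",
]


def ragtag_patch_filename(key: str) -> str:
    """Return the RagTag file name for a file-type output key."""
    if key not in FILE_KEYS:
        # "asm_dir" is a directory and is handled separately by the wrapper.
        raise KeyError(f"not a file-type output key: {key!r}")
    return "ragtag.patch." + key.replace("_", ".")


for key in FILE_KEYS:
    print(f"{key:13s} -> {ragtag_patch_filename(key)}")
```

The asm_dir key is deliberately excluded here because it collects every file matching ragtag.patch.asm.* rather than a single file.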

Params

  • extra: additional command-line arguments, passed verbatim to ragtag.py patch
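
To make it clearer where extra and threads end up, here is a rough Python sketch of the command line the wrapper assembles. The function build_patch_command is a hypothetical stand-in for the wrapper's shell() call, and the --aligner minimap2 value is only an assumed example of what might be put in extra; check ragtag.py patch --help for the options your RagTag version accepts.

```python
# Hypothetical stand-in for the wrapper's shell() call; it only builds the
# command string and does not run RagTag.
def build_patch_command(ref, query, outdir, threads=1, extra=""):
    """Assemble the ragtag.py patch command in the order the wrapper uses."""
    parts = [
        "ragtag.py patch",
        ref,               # snakemake.input.ref, first positional argument
        query,             # snakemake.input.query, second positional argument
        extra,             # params.extra, inserted unchanged
        f"-o {outdir}",    # the wrapper points this at a temporary directory
        f"-t {threads}",   # the rule's threads value
    ]
    # Unlike the wrapper, this sketch drops an empty `extra` so the result
    # contains no double spaces.
    return " ".join(part for part in parts if part)


print(
    build_patch_command(
        "fasta/ref.fasta",
        "fasta/asm.fasta",
        "/tmp/ragtag_out",
        threads=16,
        extra="--aligner minimap2",  # assumed example value
    )
)
```

Because -o and -t are set by the wrapper itself, they should not be repeated in extra.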

Authors

  • Curro Campuzano Jiménez

Code

"""Snakemake wrapper for ragtag-patch."""

__author__ = "Curro Campuzano Jiménez"
__copyright__ = "Copyright 2023, Curro Campuzano Jiménez"
__email__ = "campuzanocurro@gmail.com"
__license__ = "MIT"

import tempfile
from snakemake.shell import shell


log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

assert snakemake.output.keys(), "Output must contain at least one named file."

valid_keys = [
    "agp",
    "fasta",
    "rename_agp",
    "rename_fasta",
    "comps_fasta",
    "ctg_agp",
    "ctg_fasta",
    "asm_dir",
]
for key in snakemake.output.keys():
    assert (
        key in valid_keys
    ), "Invalid key in output. Valid keys are: %r. Given: %r." % (valid_keys, key)

with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "ragtag.py patch"
        " {snakemake.input.ref}"
        " {snakemake.input.query}"
        " {extra}"
        " -o {tmpdir} -t {snakemake.threads}"
        " {log}"
    )
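    # Every key except "asm_dir" names a single file. RagTag writes it as
    # "ragtag.patch." plus the key with "_" replaced by "."
    # (e.g. rename_agp -> ragtag.patch.rename.agp).
    for key in valid_keys[:-1]:
        outfile = snakemake.output.get(key)
        if outfile:
            extension = key.replace("_", ".")
            shell("mv {tmpdir}/ragtag.patch.{extension} {outfile}")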
    outdir = snakemake.output.get("asm_dir")
    if outdir:
        # Move files into directory outdir
        shell("mkdir -p {outdir} && mv {tmpdir}/ragtag.patch.asm.* {outdir}")
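
The collection step at the end of the wrapper can also be expressed without shelling out to mv. The sketch below is an illustrative pure-Python equivalent, not the wrapper's implementation: collect_outputs is a hypothetical name, the requested dictionary stands in for snakemake.output, and the files in the demo are empty placeholders rather than real RagTag results.

```python
import shutil
import tempfile
from pathlib import Path

FILE_KEYS = ["agp", "fasta", "rename_agp", "rename_fasta",
             "comps_fasta", "ctg_agp", "ctg_fasta"]


def collect_outputs(tmpdir, requested):
    """Move the requested RagTag files out of tmpdir; return what was moved."""
    tmpdir = Path(tmpdir)
    moved = {}
    for key in FILE_KEYS:
        dest = requested.get(key)
        if not dest:
            continue  # optional output that the rule did not ask for
        src = tmpdir / ("ragtag.patch." + key.replace("_", "."))
        Path(dest).parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved[key] = str(dest)
    asm_dir = requested.get("asm_dir")
    if asm_dir:
        Path(asm_dir).mkdir(parents=True, exist_ok=True)
        # sorted() materialises the matches before any file is moved.
        for src in sorted(tmpdir.glob("ragtag.patch.asm.*")):
            shutil.move(str(src), str(Path(asm_dir) / src.name))
        moved["asm_dir"] = str(asm_dir)
    return moved


# Demo with placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as work:
    ragtag_out = Path(work) / "ragtag_out"
    ragtag_out.mkdir()
    for name in ("ragtag.patch.fasta", "ragtag.patch.asm.example"):
        (ragtag_out / name).write_text("placeholder\n")
    result = collect_outputs(
        ragtag_out,
        {"fasta": str(Path(work) / "results" / "patched.fasta"),
         "asm_dir": str(Path(work) / "results" / "asm")},
    )
    print(sorted(result))
```

One behavioural difference from the wrapper: the shell version fails when no ragtag.patch.asm.* file exists, whereas this sketch simply creates an empty directory.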