NGSCHECKMATE MAKESNVPATTERN

Generate SNP pattern file

URL: https://github.com/parklab/NGSCheckMate?tab=readme-ov-file#1-patterngenerator

Example

This wrapper can be used in the following way:

rule test_ngscheckmate_makesnvpattern:
    input:
        fasta="genome.fasta",
        bed="variants.bed",
        index=multiext(
            "genome_bwt",
            ".1.ebwt",
            ".2.ebwt",
            ".3.ebwt",
            ".4.ebwt",
            ".rev.1.ebwt",
            ".rev.2.ebwt",
        ),
    output:
        pattern="genome.pt",
        fasta=temp("genome.pt.fasta"),
        pattern_uncompressed=temp("genome.pt.txt.sorted"),
    threads: 4
    log:
        "makesnvpattern.log",
    wrapper:
        "v5.7.0/bio/ngscheckmate/makesnvpattern"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

The underlying script always runs bowtie with 4 threads and does not allow the user to change this value.

Software dependencies

  • ngscheckmate=1.0.1

Input/Output

Input:

  • bed: Path to bed intervals

  • fasta: Path to fasta genome sequence

  • index: List of paths to bowtie index files

Output:

  • fasta: Path to fasta-formatted regions extracted from bed intervals. Unique numeric names are given to each region.

  • pattern_uncompressed: Path to uncompressed patterns, used for internal pattern checks only. Col1 = sequence, Col2 = reference max count, Col3 = variant max count.

  • pattern: Path to compressed (binary) pattern file. Main output file.
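The three-column layout of pattern_uncompressed can be read with a few lines of Python if you want to inspect it. This is only a minimal sketch: it assumes the columns are whitespace-separated and that both counts are integers, neither of which is stated above. PatternRow and parse_pattern_lines are hypothetical names, and the sample rows are made up for illustration.

```python
from typing import NamedTuple


class PatternRow(NamedTuple):
    sequence: str        # Col1
    ref_max_count: int   # Col2 (assumed to be an integer)
    var_max_count: int   # Col3 (assumed to be an integer)


def parse_pattern_lines(lines):
    """Parse rows of: sequence, reference max count, variant max count."""
    rows = []
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        rows.append(PatternRow(fields[0], int(fields[1]), int(fields[2])))
    return rows


# Made-up rows, for illustration only
example = ["ACGTACGTAC 12 3", "TTGACCAGTA 7 0"]
rows = parse_pattern_lines(example)
print(rows[0].sequence, rows[0].ref_max_count, rows[0].var_max_count)
```

To read a real file, pass an open file handle instead of the example list, since iterating over a handle yields lines.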

Authors

  • Thibault Dayris

Code

# coding: utf-8

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2024, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell
from tempfile import TemporaryDirectory
from os.path import commonprefix

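# Append stdout and stderr of every shell call to the log file
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)

# Bowtie index base name: prefix shared by all index files, without the trailing dot
index = commonprefix(snakemake.input.index).rstrip(".")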

with TemporaryDirectory() as tempdir:
    shell(
        "makesnvpattern.pl "
        "{snakemake.input.bed} "
        "{snakemake.input.fasta} "
        "{index} {tempdir} snake_out {log}"
    )

    # Ensure user can name each file according to their need
    output_mapping = {
        "fasta": f"{tempdir}/snake_out.fasta",
        "pattern": f"{tempdir}/snake_out.pt",
        "pattern_uncompressed": f"{tempdir}/snake_out.uniq.txt.sorted",
    }

    for output_key, temp_file in output_mapping.items():
        output_path = snakemake.output.get(output_key)
        if output_path:
            shell("mv --verbose {temp_file:q} {output_path:q} {log}")
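The wrapper code above derives the bowtie index base name from the list of index files with commonprefix. The sketch below reproduces that step outside Snakemake, using the file names from the example rule. The variable names index_files, raw_prefix and index_prefix are illustrative only.

```python
from os.path import commonprefix

# Same index file names as in the example rule
suffixes = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt", ".rev.1.ebwt", ".rev.2.ebwt")
index_files = ["genome_bwt" + suffix for suffix in suffixes]

# commonprefix compares the paths character by character, so the shared
# part still ends with the dot that precedes each suffix
raw_prefix = commonprefix(index_files)

# Stripping trailing dots leaves the base name that is passed to the script
index_prefix = raw_prefix.rstrip(".")
print(index_prefix)
```

Because the comparison is character-wise, this only yields the base name when several index files are listed. With a single path, the whole file name would be returned as the prefix.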