.. _`bio/mashmap`: MASHMAP ======= .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/mashmap?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/mashmap Compute local alignment boundaries between long DNA sequences with MashMap **URL**: https://github.com/marbl/MashMap Example ------- This wrapper can be used in the following way: .. code-block:: python rule test_mashmap: input: ref="reference.fasta.gz", # This can be a txt file with a path to a fasta-file per line query="read.fasta.gz", output: "mashmap.out", threads: 2 params: extra="-s 1000 --pi 99", log: "logs/mashmap.log", wrapper: "v3.0.1/bio/mashmap" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Notes ----- * `input.ref` may be either a path to a fasta file or a text file containing a list of paths to several fasta files. * `input.query` may be either a path to a fastq file or a text file containing a list of paths to several fastq files. Software dependencies --------------------- * ``mashmap=3.1.1`` * ``gsl=2.7`` * ``gzip=1.13`` Input/Output ------------ **Input:** * ``ref``: Path to reference file * ``query``: Path to query file (fasta, fastq) **Output:** * Path to the alignment file Params ------ * ``extra``: Optional parameters for MashMap Authors ------- * Thibault Dayris Code ---- .. code-block:: python #!/usr/bin/python3.8 # coding: utf-8 """ Snakemake wrapper for MashMap """ __author__ = "Thibault Dayris" __copyright__ = "Copyright 2022, Thibault Dayris" __email__ = "thibault.dayris@gustaveroussy.fr" __license__ = "MIT" from snakemake.shell import shell log = snakemake.log_fmt_shell(stdout=True, stderr=True) extra = snakemake.params.get("extra", "") max_threads = snakemake.threads # Handling input file types (either a fasta file, or a text file with a list of paths to fasta files) ref = snakemake.input["ref"] if ref.endswith(".txt"): ref = f"--refList {ref}" elif ref.endswith(".gz"): ref = f"--ref <( gzip --decompress --stdout {ref} )" max_threads -= 1 else: ref = f"--ref {ref}" if max_threads < 1: raise ValueError( "Reference fasta on-the-fly g-unzipping consumed one thread." f" Please increase the number of available threads by {1 - max_threads}." ) # Handling query file format (either a fastq file or a text file with a list of fastq files) query = snakemake.input["query"] if query.endswith(".txt"): query = f"--queryList {query}" else: query = f"--query {query}" shell( "mashmap " "{ref} " "{query} " "--output {snakemake.output} " "--threads {snakemake.threads} " "{extra} " "{log}" ) .. |nl| raw:: html