MINIMAP2

A versatile pairwise aligner for genomic and spliced nucleotide sequences.

URL:

Example

This wrapper can be used in the following way:

rule minimap2_paf:
    input:
        target="target/{input1}.mmi", # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"]
    output:
        "aligned/{input1}_aln.paf"
    log:
        "logs/minimap2/{input1}.log"
    params:
        extra="-x map-pb",           # optional
        sorting="coordinate",        # optional: 'none', 'queryname' or 'coordinate'; ignored for PAF output
        sort_extra=""                # optional: extra arguments for samtools sort
    threads: 3
    wrapper:
        "v1.1.0/bio/minimap2/aligner"

rule minimap2_sam:
    input:
        target="target/{input1}.mmi", # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"]
    output:
        "aligned/{input1}_aln.sam"
    log:
        "logs/minimap2/{input1}.log"
    params:
        extra="-x map-pb",           # optional
        sorting="none",              # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra=""                # optional: extra arguments for samtools sort
    threads: 3
    wrapper:
        "v1.1.0/bio/minimap2/aligner"

rule minimap2_bam:
    input:
        target="target/{input1}.mmi", # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"]
    output:
        "aligned/{input1}_aln.bam"
    log:
        "logs/minimap2/{input1}.log"
    params:
        extra="-x map-pb",           # optional
        sorting="coordinate",        # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra=""                # optional: extra arguments for samtools sort
    threads: 3
    wrapper:
        "v1.1.0/bio/minimap2/aligner"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • minimap2==2.17
  • samtools==1.12

Input/Output

Input:

  • FASTA/FASTQ file(s) with the query sequences
  • reference genome (FASTA file or minimap2 index)

Output:

  • PAF/SAM/BAM/CRAM file
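
The output format is inferred from the extension of the first output file. The following is a minimal sketch of that inference, mirroring the path.splitext logic shown in the Code section; the helper name infer_output_format is hypothetical and not part of the wrapper.

```python
from os import path


def infer_output_format(output_file):
    """Return the upper-cased extension of output_file without the leading dot.

    Hypothetical helper for illustration only; the wrapper does this inline.
    """
    _, ext = path.splitext(output_file)
    return ext[1:].upper()


for name in [
    "aligned/sample_aln.paf",
    "aligned/sample_aln.sam",
    "aligned/sample_aln.bam",
]:
    print(name, "->", infer_output_format(name))
```

Only the last suffix is considered, so a doubly-suffixed name such as sample_aln.sam.gz would be read as GZ rather than SAM.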

Notes

  • The extra param passes additional arguments to minimap2.
  • The sorting param enables sorting of the output (it is ignored for PAF output) and can be ‘none’, ‘queryname’ or ‘coordinate’.
  • The sort_extra param passes additional arguments to samtools sort.
  • For more information, see https://lh3.github.io/minimap2
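
How the output format and the sorting param interact can be sketched as a small pure function. This is an illustrative restatement of the branching in the Code section, not the wrapper itself; the function name select_pipe_cmd is hypothetical. For any non-PAF output the wrapper additionally appends -a to the minimap2 arguments, which is not shown here.

```python
def select_pipe_cmd(out_ext, sorting="none", sort_extra=""):
    """Return the shell fragment piped after minimap2, or "" if none is needed.

    Hypothetical helper: out_ext is the upper-cased output extension
    (e.g. "PAF", "SAM", "BAM", "CRAM").
    """
    if out_ext == "PAF":
        # PAF is written directly by minimap2; sorting has no effect.
        return ""
    if sorting == "none":
        if out_ext == "SAM":
            # minimap2 -a already emits SAM, so no conversion is needed.
            return ""
        # Convert the SAM stream to the requested format.
        return "| samtools view -h --output-fmt {} -".format(out_ext)
    if sorting in ("coordinate", "queryname"):
        if sorting == "queryname":
            # samtools sort -n sorts by read name instead of coordinate.
            sort_extra += " -n"
        return "| samtools sort {} --output-fmt {} -".format(sort_extra, out_ext)
    raise ValueError("Unexpected value for params.sorting ({})".format(sorting))


print(select_pipe_cmd("BAM", sorting="none"))
print(select_pipe_cmd("BAM", sorting="queryname"))
```

Note that a SAM output with sorting set to ‘coordinate’ or ‘queryname’ still goes through samtools sort, with SAM as the output format.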

Authors

  • Tom Poorten
  • Michael Hall
  • Filipe G. Vieira

Code

__author__ = "Tom Poorten"
__copyright__ = "Copyright 2017, Tom Poorten"
__email__ = "tom.poorten@gmail.com"
__license__ = "MIT"

from os import path
from snakemake.shell import shell


inputQuery = " ".join(snakemake.input.query)

# Extract output format
out_name, out_ext = path.splitext(snakemake.output[0])
out_ext = out_ext[1:].upper()

# Extract arguments.
extra = snakemake.params.get("extra", "")

sort = snakemake.params.get("sorting", "none")
sort_extra = snakemake.params.get("sort_extra", "")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

pipe_cmd = ""
if out_ext != "PAF":
    # Add option for SAM output
    extra += " -a"

    # Determine which pipe command to use for converting to bam or sorting.
    if sort == "none":

        if out_ext != "SAM":
            # Simply convert to output format using samtools view.
            pipe_cmd = "| samtools view -h --output-fmt {} -".format(out_ext)

    elif sort in ["coordinate", "queryname"]:

        # Add name flag if needed.
        if sort == "queryname":
            sort_extra += " -n"

        # Sort alignments.
        pipe_cmd = "| samtools sort {} --output-fmt {} -".format(sort_extra, out_ext)

    else:
        raise ValueError("Unexpected value for params.sorting ({})".format(sort))


shell(
    "(minimap2 -t {snakemake.threads} {extra} "
    "{snakemake.input.target} {inputQuery} {pipe_cmd} > {snakemake.output[0]}) {log}"
)
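
To inspect what the shell(...) call above expands to, the command string can be assembled without executing it. The sketch below uses a hypothetical helper, preview_command, and made-up sample paths; it also assumes that stderr-only logging is rendered as "2> <log file>", which should be checked against the Snakemake version in use.

```python
def preview_command(threads, extra, target, queries, pipe_cmd, output, log_file):
    """Assemble (but do not run) a command line shaped like the wrapper's.

    Hypothetical helper for illustration; nothing is executed.
    """
    # Assumption: stderr of the whole subshell is redirected to the log file.
    log = "2> {}".format(log_file)
    return "(minimap2 -t {} {} {} {} {} > {}) {}".format(
        threads, extra, target, " ".join(queries), pipe_cmd, output, log
    )


cmd = preview_command(
    threads=3,
    extra="-x map-pb -a",  # -a is appended by the wrapper for non-PAF output
    target="target/sample.mmi",
    queries=["query/reads1.fasta", "query/reads2.fasta"],
    pipe_cmd="| samtools sort --output-fmt BAM -",
    output="aligned/sample_aln.bam",
    log_file="logs/minimap2/sample.log",
)
print(cmd)
```

Because the pipeline is wrapped in parentheses, the redirection applies to the stderr of both minimap2 and samtools.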