MINIMAP2
A versatile pairwise aligner for genomic and spliced nucleotide sequences.
URL: https://lh3.github.io/minimap2
Example
This wrapper can be used in the following way:
rule minimap2_paf:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"],
    output:
        "aligned/{input1}_aln.paf",
    log:
        "logs/minimap2/{input1}.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_sam:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"],
    output:
        "aligned/{input1}_aln.sam",
    log:
        "logs/minimap2/{input1}.log",
    params:
        extra="-x map-pb",  # optional
        sorting="none",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_sam_sorted:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"],
    output:
        "aligned/{input1}_aln.sorted.sam",
    log:
        "logs/minimap2/{input1}.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_bam_sorted:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query=["query/reads1.fasta", "query/reads2.fasta"],
    output:
        "aligned/{input1}_aln.sorted.bam",
        idx="aligned/{input1}_aln.sorted.bam.bai",
    log:
        "logs/minimap2/{input1}.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_ubam_paf:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query="query/reads.bam",
    output:
        "aligned/{input1}_aln.ubam.paf",
    log:
        "logs/minimap2/{input1}.ubam.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_ubam_sam:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query="query/reads.bam",
    output:
        "aligned/{input1}_aln.ubam.sam",
    log:
        "logs/minimap2/{input1}.ubam.log",
    params:
        extra="-x map-pb",  # optional
        sorting="none",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_ubam_sam_sorted:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query="query/reads.bam",
    output:
        "aligned/{input1}_aln.sorted.ubam.sam",
    log:
        "logs/minimap2/{input1}.ubam.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
rule minimap2_ubam_bam_sorted:
    input:
        target="target/{input1}.mmi",  # can be either genome index or genome fasta
        query="query/reads.bam",
    output:
        "aligned/{input1}_aln.sorted.ubam.bam",
        idx="aligned/{input1}_aln.sorted.ubam.bam.bai",
    log:
        "logs/minimap2/{input1}.ubam.log",
    params:
        extra="-x map-pb",  # optional
        sorting="coordinate",  # optional: Enable sorting. Possible values: 'none', 'queryname' or 'coordinate'
        sort_extra="",  # optional: extra arguments for samtools/picard
    threads: 3
    wrapper:
        "v9.9.0/bio/minimap2/aligner"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
minimap2=2.31
samtools=1.23.1
snakemake-wrapper-utils=0.8.0
Input/Output
Input:
query: FASTA/FASTQ file(s) or a single unaligned BAM (uBAM) file
target: reference genome (FASTA) or minimap2 index (.mmi)
Output:
PAF or SAM/BAM/CRAM file
idx: Index for SAM/BAM/CRAM file
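The wrapper picks its behaviour from file extensions: a BAM query is treated as uBAM, and uBAM mode accepts exactly one file. The sketch below illustrates that check in isolation. Both `infer_format` and `classify_query` are hypothetical helpers written for this example; the wrapper itself relies on `infer_out_format` from snakemake-wrapper-utils, whose exact behaviour may differ.

```python
from os import path


def infer_format(file_name):
    """Return the upper-cased file extension, e.g. 'BAM' for 'reads.bam'.

    Hypothetical stand-in for illustration only; the wrapper uses
    snakemake_wrapper_utils.samtools.infer_out_format instead.
    """
    return path.splitext(str(file_name))[1].lstrip(".").upper()


def classify_query(query):
    """Classify the query input as 'ubam' or 'sequence'.

    Follows the wrapper's rule: only the first file's extension is
    inspected, and a BAM query must be a single file.
    """
    files = query if isinstance(query, list) else [query]
    if infer_format(files[0]) == "BAM":
        if len(files) > 1:
            raise ValueError("uBAM input mode only supports a single uBAM file")
        return "ubam"
    return "sequence"
```

With the inputs from the example rules, `classify_query("query/reads.bam")` yields `"ubam"`, while the two-FASTA list yields `"sequence"`.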
Params
sorting: Enable sorting (ignored for PAF output); one of ‘none’, ‘queryname’ or ‘coordinate’ (default: ‘none’).
sort_extra: Additional arguments for samtools/picard.
extra: Additional arguments for minimap2.
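How the sorting parameter and the output extension interact can be sketched as a small standalone function. This is a simplified illustration, not the wrapper's code: `build_pipe_cmd` is a hypothetical name, and `samtools_opts` is a plain string placeholder here, whereas in the wrapper it is produced by `get_samtools_opts` and is expected to carry the output destination.

```python
def build_pipe_cmd(output, sorting="none", sort_extra="", samtools_opts=""):
    """Return the shell fragment that post-processes minimap2's stdout.

    Hypothetical helper for illustration; samtools_opts is a placeholder
    string (assumption), not the value computed by the real wrapper.
    """
    out_ext = output.rsplit(".", 1)[-1].upper()
    # Default: plain redirection, used for PAF and for unsorted SAM.
    pipe_cmd = f"> {output}"
    if out_ext == "PAF":
        # Sorting does not apply to PAF output.
        return pipe_cmd
    if sorting == "none":
        if out_ext != "SAM":
            # Convert to the requested binary format without sorting.
            pipe_cmd = f"| samtools view -h {samtools_opts}"
    elif sorting in ("coordinate", "queryname"):
        if sorting == "queryname":
            # samtools sort -n sorts by read name.
            sort_extra += " -n"
        pipe_cmd = f"| samtools sort {sort_extra} {samtools_opts}"
    else:
        raise ValueError(f"Unexpected value for params.sorting: {sorting}")
    return pipe_cmd
```

For instance, a `.paf` output keeps the plain redirection even when `sorting="coordinate"` is set, which is why the PAF example rules work with any sorting value.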
Code
__author__ = "Tom Poorten"
__copyright__ = "Copyright 2017, Tom Poorten"
__email__ = "tom.poorten@gmail.com"
__license__ = "MIT"
from os import path
from snakemake.shell import shell
from snakemake_wrapper_utils.samtools import infer_out_format
from snakemake_wrapper_utils.samtools import get_samtools_opts
samtools_opts = get_samtools_opts(snakemake, param_name="sort_extra")
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
sort = snakemake.params.get("sorting", "none")
sort_extra = snakemake.params.get("sort_extra", "")
# Infer the query format from the (first) query file; uBAM mode takes one file only.
if isinstance(snakemake.input.query, list):
    in_ext = infer_out_format(snakemake.input.query[0])
    if in_ext == "BAM" and len(snakemake.input.query) > 1:
        raise ValueError("uBAM input mode only supports a single uBAM file")
else:
    in_ext = infer_out_format(snakemake.input.query)
pre_cmd = ""
query = ""
if in_ext == "BAM":
    # convert uBAM to fastq keeping all tags
    pre_cmd = f'samtools fastq -T "*" {snakemake.input.query} |'
    # tell minimap2 to parse tags from fastq header
    extra += " -y"
    query = "-"
else:
    query = snakemake.input.query
out_ext = infer_out_format(snakemake.output[0])
pipe_cmd = f"> {snakemake.output[0]}"
if out_ext != "PAF":
    # Add option for SAM output
    extra += " -a"
    # Determine which pipe command to use for converting to bam or sorting.
    if sort == "none":
        if out_ext != "SAM":
            # Simply convert to output format using samtools view.
            pipe_cmd = f"| samtools view -h {samtools_opts}"
    elif sort in ["coordinate", "queryname"]:
        # Add name flag if needed.
        if sort == "queryname":
            sort_extra += " -n"
        # Sort alignments.
        pipe_cmd = f"| samtools sort {sort_extra} {samtools_opts}"
    else:
        raise ValueError(f"Unexpected value for params.sorting: {sort}")
shell(
    "({pre_cmd}"
    " minimap2"
    " -t {snakemake.threads}"
    " {extra} "
    " {snakemake.input.target}"
    " {query}"
    " {pipe_cmd}"
    ") {log}"
)
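To see what command line the code above roughly produces without running Snakemake, the assembly can be mimicked with plain strings. The function below is a hypothetical preview helper, not part of the wrapper. It covers only the unsorted text outputs (PAF and SAM), detects uBAM by a `.bam` suffix (an assumption), and omits the log redirection.

```python
def preview_minimap2_cmd(target, query, output, threads=1, extra=""):
    """Build an approximate minimap2 command line as a string.

    Hypothetical helper for illustration: handles PAF and unsorted SAM
    output only and leaves out the log redirection.
    """
    files = query if isinstance(query, list) else [query]
    pre_cmd = ""
    if files[0].lower().endswith(".bam"):
        # uBAM: stream reads as FASTQ, keeping all tags, into minimap2's stdin.
        pre_cmd = f'samtools fastq -T "*" {files[0]} | '
        # -y asks minimap2 to copy the FASTQ header comments (the tags).
        extra += " -y"
        query_arg = "-"
    else:
        query_arg = " ".join(files)
    if not output.lower().endswith(".paf"):
        # -a switches minimap2 from PAF to SAM output.
        extra += " -a"
    parts = ["minimap2", f"-t {threads}", extra.strip(), target, query_arg]
    return pre_cmd + " ".join(p for p in parts if p) + f" > {output}"


if __name__ == "__main__":
    print(
        preview_minimap2_cmd(
            "target/genome.mmi",
            "query/reads.bam",
            "aligned/genome_aln.ubam.paf",
            threads=3,
            extra="-x map-pb",
        )
    )
```

The file names in the usage example are placeholders modelled on the example rules; any paths can be substituted.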