SEQTK MERGEPE¶
Interleave two paired-end FASTA/Q files
URL: https://github.com/lh3/seqtk
Example¶
This wrapper can be used in the following way:
rule seqtk_mergepe:
    input:
        r1="{sample}.1.fastq.gz",
        r2="{sample}.2.fastq.gz",
    output:
        merged="{sample}.merged.fastq.gz",
    params:
        compress_lvl=9,
    log:
        "logs/seqtk_mergepe/{sample}.log",
    threads: 2
    wrapper:
        "v2.0.0/bio/seqtk/mergepe"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes¶
Multiple threads can be used during compression of the output file with pigz.
Software dependencies¶
- seqtk=1.3
- pigz=2.6
Input/Output¶
Input:
- paired fastq files - can be compressed in gzip format (*.gz).
Output:
- a single, interleaved FASTA/Q file. By default, the output is compressed; use the compress_lvl param to change this.
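To make the output format concrete, here is a minimal pure-Python sketch of what interleaving means: records from the two inputs alternate, read 1 then read 2 of each pair. It is not how seqtk is implemented. The function names `read_fastq_records` and `interleave` are hypothetical, and the sketch assumes four-line FASTQ records and inputs that list the pairs in the same order.

```python
def read_fastq_records(lines):
    """Group FASTQ lines into 4-line records (assumes sequences are not wrapped)."""
    stripped = [line.rstrip("\n") for line in lines]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    return [tuple(stripped[i:i + 4]) for i in range(0, len(stripped), 4)]


def interleave(r1_records, r2_records):
    """Alternate records: pair 1 read 1, pair 1 read 2, pair 2 read 1, ..."""
    # Requiring equal counts is a choice made for this sketch only.
    if len(r1_records) != len(r2_records):
        raise ValueError("both inputs must contain the same number of reads")
    merged = []
    for rec1, rec2 in zip(r1_records, r2_records):
        merged.append(rec1)
        merged.append(rec2)
    return merged


# Made-up example reads.
r1 = ["@read1/1", "ACGT", "+", "IIII", "@read2/1", "GGCC", "+", "IIII"]
r2 = ["@read1/2", "TTAA", "+", "IIII", "@read2/2", "CCGG", "+", "IIII"]

merged = interleave(read_fastq_records(r1), read_fastq_records(r2))
headers = [rec[0] for rec in merged]
print(headers)
```

Each mate directly follows its partner in the merged list, which is the layout downstream tools expect from an interleaved file.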
Params¶
compress_lvl: Sets the compression level, given as a single number. 1 is the fastest method (least compression) and 9 is the slowest (best compression); 0 means no compression. 11 gives a few percent better compression at a severe cost in execution time, using the zopfli algorithm. The default is 6.
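The speed/size trade-off can be illustrated with Python's standard zlib module, which offers DEFLATE levels 0 to 9 (it has no counterpart to level 11). This is only a sketch of how the levels behave, not the wrapper's code path, and the sample data is made up.

```python
import zlib

# Made-up, highly repetitive FASTQ-like payload used only for illustration.
data = b"@read\nACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIII\n" * 2000

sizes = {}
for level in (0, 1, 6, 9):
    compressed = zlib.compress(data, level)
    # Every level must round-trip to the original bytes.
    assert zlib.decompress(compressed) == data
    sizes[level] = len(compressed)

for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(data):.1%} of input)")
```

Level 0 stores the data without compressing it, so its output is slightly larger than the input because of the container overhead.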
Authors¶
- Michael Hall
Code¶
"""Snakemake wrapper for interleaving reads from paired FASTA/Q files using seqtk."""
__author__ = "Michael Hall"
__copyright__ = "Copyright 2021, Michael Hall"
__email__ = "michael@mbh.sh"
__license__ = "MIT"
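from snakemake.shell import shell

# Send stderr (not stdout, which carries the data) to the log file.
log = snakemake.log_fmt_shell(stdout=False, stderr=True, append=False)
# Compression level passed to pigz; defaults to 6.
compress_lvl = int(snakemake.params.get("compress_lvl", 6))

# Interleave the reads with seqtk, then compress the stream with pigz.
shell(
    "(seqtk mergepe {snakemake.input} "
    "| pigz -{compress_lvl} -c -p {snakemake.threads}) > {snakemake.output} {log}"
)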
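To see roughly what command the wrapper ends up running, the following sketch assembles the same pipeline as a plain string. The helper `build_mergepe_command` is hypothetical and exists only for illustration: the real wrapper lets Snakemake format the command and build the log redirection, and it does not validate the level. Here the `2>` redirection is written out by hand, and the level check is an addition based on the values listed under Params.

```python
import shlex


def build_mergepe_command(r1, r2, output, log, threads=1, compress_lvl=6):
    """Return the shell pipeline as a string (hypothetical helper, illustration only)."""
    compress_lvl = int(compress_lvl)
    # Accept levels 0-9 and 11, as described under Params (check added for this sketch).
    if compress_lvl not in (*range(10), 11):
        raise ValueError(f"unsupported compression level: {compress_lvl}")
    # Quote paths so that unusual file names survive the shell.
    inputs = " ".join(shlex.quote(path) for path in (r1, r2))
    return (
        f"(seqtk mergepe {inputs} "
        f"| pigz -{compress_lvl} -c -p {int(threads)}) "
        f"> {shlex.quote(output)} 2> {shlex.quote(log)}"
    )


cmd = build_mergepe_command(
    "a.1.fastq.gz",
    "a.2.fastq.gz",
    "a.merged.fastq.gz",
    "logs/seqtk_mergepe/a.log",
    threads=2,
    compress_lvl=9,
)
print(cmd)
```

The parentheses group seqtk and pigz into one subshell, so a single stderr redirection captures messages from both tools.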