SEQTK-SUBSAMPLE-SE
Subsample reads from a FASTQ file.
Example
This wrapper can be used in the following way:
rule seqtk_subsample_se:
    input:
        "{sample}.fastq.gz"
    output:
        "{sample}.subsampled.fastq.gz"
    params:
        n=3,
        seed=12345
    log:
        "logs/seqtk_subsample/{sample}.log"
    threads:
        1
    wrapper:
        "v1.19.1/bio/seqtk/subsample/se"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
seqtk==1.3
pigz=2.3
Input/Output
Input:
- fastq file (can be gzip compressed)
Output:
- subsampled fastq file (gzip compressed)
Params
- n: number of reads after subsampling
- seed: seed to initialize a pseudorandom number generator
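To illustrate how the two parameters interact, here is a minimal Python sketch of seeded subsampling. It is not seqtk's implementation: the function name `reservoir_sample` and the read labels are hypothetical, and the algorithm shown is plain reservoir sampling chosen for illustration. The point is only that a fixed `seed` makes the choice of `n` reads reproducible.

```python
import random


def reservoir_sample(records, n, seed):
    """Pick n records uniformly at random in one pass (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed gives a reproducible selection
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(record)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = record
    return reservoir


# Illustrative read names only; real input would be FASTQ records.
reads = [f"read{i}" for i in range(100)]
first = reservoir_sample(reads, n=3, seed=12345)
second = reservoir_sample(reads, n=3, seed=12345)
```

Running the sketch twice with the same seed returns the same three reads; changing the seed will generally select a different subset.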
Authors
- Fabian Kilpert
Code
"""Snakemake wrapper for subsampling reads from FASTQ file using seqtk."""

__author__ = "Fabian Kilpert"
__copyright__ = "Copyright 2020, Fabian Kilpert"
__email__ = "fkilpert@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell

# The `snakemake` object is injected by Snakemake when the wrapper runs.
log = snakemake.log_fmt_shell()

# Subsample with seqtk, then compress the result with pigz.
shell(
    "( "
    "seqtk sample "
    "-s {snakemake.params.seed} "
    "{snakemake.input} "
    "{snakemake.params.n} "
    "| pigz -9 -p {snakemake.threads} "
    "> {snakemake.output} "
    ") {log} "
)
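As a rough illustration of what the wrapper ends up executing, the sketch below assembles the same command string in plain Python for the example rule with the wildcard `sample` set to `a`. The helper `render_command` is hypothetical and not part of the wrapper. The log redirection shown (`> ... 2>&1`) is an assumption about what `log_fmt_shell()` returns with its default arguments.

```python
def render_command(input_path, output_path, n, seed, threads, log_redirect=""):
    """Build the shell pipeline from the same pieces the wrapper passes to shell()."""
    return (
        "( "
        f"seqtk sample -s {seed} {input_path} {n} "
        f"| pigz -9 -p {threads} "
        f"> {output_path} "
        f") {log_redirect}"
    ).strip()


cmd = render_command(
    input_path="a.fastq.gz",
    output_path="a.subsampled.fastq.gz",
    n=3,
    seed=12345,
    threads=1,
    # Assumed form of the redirection produced by log_fmt_shell().
    log_redirect="> logs/seqtk_subsample/a.log 2>&1",
)
print(cmd)
```

Because the pipeline's standard output is redirected into the output file inside the parentheses, the log file mainly captures messages written to standard error by seqtk and pigz.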