.. _`bio/seqtk`:

SEQTK
=====

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/seqtk?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/seqtk

Toolkit for processing sequences in FASTA/Q formats.

**URL**: https://github.com/lh3/seqtk

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule seqtk_seq_fq2fas:
        input:
            "reads/{prefix}.fastq",
        output:
            "results/fq2fas/{prefix}.fasta",
        log:
            "logs/fq2fas/{prefix}.log",
        params:
            command="seq",
            extra="-A",
        wrapper:
            "v3.0.4/bio/seqtk"


    rule seqtk_seq_convBQ:
        input:
            "reads/{prefix}.fastq",
        output:
            "results/convBQ/{prefix}.fasta",
        log:
            "logs/convBQ/{prefix}.log",
        params:
            command="seq",
            extra="-aQ 64 -q 20 -n N",
        wrapper:
            "v3.0.4/bio/seqtk"


    rule seqtk_subseq_list:
        input:
            "reads/{prefix}.fastq",
            "reads/id.list",
        output:
            "results/subseq_list/{prefix}.fq.gz",
        log:
            "logs/subseq_list/{prefix}.log",
        params:
            command="subseq",
            extra="",
        wrapper:
            "v3.0.4/bio/seqtk"


    rule seqtk_mergepe:
        input:
            r1="reads/{sample}.1.fastq.gz",
            r2="reads/{sample}.2.fastq.gz",
        output:
            merged="results/mergepe/{sample}.fastq.gz",
        log:
            "logs/mergepe/{sample}.log",
        params:
            command="mergepe",
            compress_lvl=9,
        threads: 2
        wrapper:
            "v3.0.4/bio/seqtk"


    rule seqtk_sample_se:
        input:
            "reads/{sample}.fastq.gz",
        output:
            "results/sample_se/{sample}.fastq.gz",
        log:
            "logs/sample_se/{sample}.log",
        params:
            command="sample",
            n=3,
            extra="-s 12345",
        threads: 1
        wrapper:
            "v3.0.4/bio/seqtk"


    rule seqtk_sample_pe:
        input:
            f1="reads/{sample}.1.fastq.gz",
            f2="reads/{sample}.2.fastq.gz",
        output:
            f1="results/sample_pe/{sample}.1.fastq.gz",
            f2="results/sample_pe/{sample}.2.fastq.gz",
        log:
            "logs/sample_pe/{sample}.log",
        params:
            command="sample",
            n=3,
            extra="-s 12345",
        threads: 1
        wrapper:
            "v3.0.4/bio/seqtk"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes
-----

* Multiple threads can be used during compression of the output file.

Software dependencies
---------------------

* ``seqtk=1.4``
* ``pigz``

Input/Output
------------

**Input:**

* fastx file(s) (can be gzip compressed)

**Output:**

* fastx file(s) (can be gzip compressed)

Params
------

* ``n``: number of reads after subsampling (for ``sample``)
* ``extra``: additional program options (e.g. ``-s`` for ``sample`` or ``-b/-e`` for ``trimfq``)
* ``compress_lvl``: compression level (see the ``gzip`` manual for details)

Authors
-------

* Filipe G. Vieira

Code
----

.. code-block:: python

    """Snakemake wrapper for SeqTk."""

    __author__ = "Filipe G. Vieira"
    __copyright__ = "Copyright 2023, Filipe G. Vieira"
    __license__ = "MIT"

    from snakemake.shell import shell

    log = snakemake.log_fmt_shell(stdout=False, stderr=True, append=False)
    extra = snakemake.params.get("extra", "")
    compress_lvl = snakemake.params.get("compress_lvl", "6")

    # Compress with pigz only when the first output file ends in ".gz".
    pipe_comp = (
        f"| pigz --processes {snakemake.threads} -{compress_lvl} --stdout"
        if snakemake.output[0].endswith(".gz")
        else ""
    )

    if snakemake.params.command == "sample":
        n_reads = snakemake.params.get("n", "")
        assert len(snakemake.input) == len(
            snakemake.output
        ), "Command 'sample' requires same number of input and output files."

        # Subsample each input file into the output file at the same position.
        for in_fx, out_fx in zip(snakemake.input, snakemake.output):
            shell(
                "(seqtk {snakemake.params.command} {extra} {in_fx} {n_reads} {pipe_comp} > {out_fx}) {log}"
            )
    else:
        # All other commands take every input at once and write a single stream.
        shell(
            "(seqtk {snakemake.params.command} {extra} {snakemake.input} {pipe_comp} > {snakemake.output}) {log}"
        )

.. |nl| raw:: html
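
To make the command assembly in the Code section easier to follow, here is a minimal standalone sketch of the same branching logic without the ``snakemake`` object. The function name ``build_seqtk_commands`` and its arguments are hypothetical and are not part of the wrapper; the sketch only builds the command strings and does not run ``seqtk`` or ``pigz``, and it omits the log redirection.

```python
def build_seqtk_commands(
    command, inputs, outputs, extra="", compress_lvl=6, threads=1, n=""
):
    """Return the shell command strings a wrapper like this one would run.

    Hypothetical helper for illustration only; it mirrors the branching in
    the wrapper code but is not part of the wrapper itself.
    """
    # As in the wrapper, only the first output decides whether to compress.
    pipe_comp = (
        f"| pigz --processes {threads} -{compress_lvl} --stdout"
        if outputs[0].endswith(".gz")
        else ""
    )

    if command == "sample":
        # 'sample' pairs each input with the output at the same position.
        if len(inputs) != len(outputs):
            raise ValueError(
                "Command 'sample' requires same number of input and output files."
            )
        cmds = [
            f"seqtk sample {extra} {in_fx} {n} {pipe_comp} > {out_fx}"
            for in_fx, out_fx in zip(inputs, outputs)
        ]
    else:
        # Other commands receive all inputs in a single call.
        cmds = [
            f"seqtk {command} {extra} {' '.join(inputs)} {pipe_comp} "
            f"> {' '.join(outputs)}"
        ]

    # Collapse the extra spaces left behind by empty optional parts.
    return [" ".join(cmd.split()) for cmd in cmds]


if __name__ == "__main__":
    # Mirrors the seqtk_sample_pe example rule: one command per mate file.
    for cmd in build_seqtk_commands(
        "sample",
        ["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"],
        ["results/sample_pe/a.1.fastq.gz", "results/sample_pe/a.2.fastq.gz"],
        extra="-s 12345",
        n=3,
    ):
        print(cmd)
```

With paired-end ``sample``, passing the same seed through ``extra`` to both calls is what keeps the two subsampled files in sync, which is why the example rules use ``-s 12345`` for both mates.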