MERYL COUNT

A genomic k-mer counter (and sequence utility) with nice features.

URL: https://github.com/marbl/meryl

Example

This wrapper can be used in the following way:

rule meryl_count:
    input:
        fasta="{genome}.fasta",
    output:
        directory("{genome}/"),
    log:
        "logs/meryl_count/{genome}.log",
    params:
        command="count",
        extra="k=32",
    threads: 2
    resources:
        mem_mb=2048,
    wrapper:
        "v1.9.0/bio/meryl/count"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The command param specifies how k-mers are counted: count (canonical k-mers; the default), count-forward (forward k-mers only), or count-reverse (reverse k-mers only).
  • The extra param passes additional arguments to meryl. The k-mer size k is mandatory, so it must be supplied here (e.g. k=32).
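
To make the roles of command and extra concrete, the following is a minimal sketch of how the meryl call is put together from the rule's settings. The helper build_meryl_command and the constant VALID_COMMANDS are hypothetical names used only for illustration. They mirror the order of arguments in the wrapper code listed under Code, but they are not part of the wrapper, and the sketch raises a ValueError where the wrapper uses an assert.

```python
# Hypothetical helper for illustration only; not part of the wrapper.
VALID_COMMANDS = ("count", "count-forward", "count-reverse")


def build_meryl_command(fasta, outdir, command="count", extra="", threads=1, mem_gb=0):
    """Return the meryl command line as a single string."""
    # Reject anything other than the three counting modes listed above.
    if command not in VALID_COMMANDS:
        raise ValueError(f"invalid command specified: {command!r}")
    parts = [
        "meryl",
        command,
        f"threads={threads}",
        f"memory={mem_gb}",
        extra,  # carries the mandatory k-mer size, e.g. "k=32"
        fasta,
        "output",
        outdir,
    ]
    # Drop empty pieces (such as an empty `extra`) before joining.
    return " ".join(p for p in parts if p)


# Values taken from the example rule, with {genome} resolved to "genome".
cmd = build_meryl_command("genome.fasta", "genome/", extra="k=32", threads=2, mem_gb=2.0)
print(cmd)
```

With the values from the example rule, this prints a command of the form meryl count threads=2 memory=2.0 k=32 genome.fasta output genome/.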

Software dependencies

  • meryl=1.3
  • snakemake-wrapper-utils=0.4

Input/Output

Input:

  • FASTA file

Output:

  • meryl database (a directory)

Authors

  • Filipe G. Vieira

Code

__author__ = "Filipe G. Vieira"
__copyright__ = "Copyright 2022, Filipe G. Vieira"
__license__ = "MIT"


from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import get_mem


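# Optional extra arguments for meryl (must include the k-mer size, e.g. "k=32").
extra = snakemake.params.get("extra", "")
# Redirect both stdout and stderr to the rule's log file.
log = snakemake.log_fmt_shell(stdout=True, stderr=True)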


# Counting mode; defaults to canonical k-mer counting.
command = snakemake.params.get("command", "count")
assert command in [
    "count",
    "count-forward",
    "count-reverse",
], "invalid command specified."


# Memory requested in the rule's resources, converted to GiB for meryl's memory= option.
mem_gb = get_mem(snakemake, out_unit="GiB")


shell(
    "meryl"
    " {command}"
    " threads={snakemake.threads}"
    " memory={mem_gb}"
    " {extra}"
    " {snakemake.input}"
    " output {snakemake.output}"
    " {log}"
)
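
The wrapper passes the rule's memory request to meryl in GiB. As a rough illustration of that unit conversion, the sketch below assumes that mem_mb is interpreted as MiB and divided by 1024. This is an assumption about how get_mem behaves, not a copy of its implementation, and mem_mib_to_gib is a hypothetical name.

```python
def mem_mib_to_gib(mem_mb):
    """Convert a Snakemake `mem_mb` value to GiB, assuming 1 GiB = 1024 MiB."""
    return mem_mb / 1024


# Value from the example rule: resources: mem_mb=2048
mem_gb = mem_mib_to_gib(2048)
print(f"memory={mem_gb}")
```

Under this assumption, the example rule's mem_mb=2048 would reach meryl as memory=2.0.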