.. _`bio/mlst`:

MLST
====

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/mlst?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/mlst

Scan contig files against traditional PubMLST typing schemes.

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule run_mlst:
        input:
            # Input assembly
            assembly="{sample}.fasta",
        output:
            # Tab-delimited mlst designation
            mlst="{sample}_mlst.txt",
        params:
            # Extra parameters should be space-delimited.
            # The mlst help text is reproduced below for reference.
            # SYNOPSIS
            #   Automatic MLST calling from assembled contigs
            # USAGE
            #   % mlst --list   # list known schemes
            #   % mlst [options]   # force a scheme
            # GENERAL
            #   --help          This help
            #   --version       Print version and exit(default ON)
            #   --check         Just check dependencies and exit (default OFF)
            #   --quiet         Quiet - no stderr output (default OFF)
            #   --threads [N]   Number of BLAST threads (suggest GNU Parallel instead) (default '1')
            #   --debug         Verbose debug output to stderr (default OFF)
            # SCHEME
            #   --scheme [X]    Don't autodetect, force this scheme on all inputs (default '')
            #   --list          List available MLST scheme names (default OFF)
            #   --longlist      List allelles for all MLST schemes (default OFF)
            #   --exclude [X]   Ignore these schemes (comma sep. list) (default 'ecoli_2,abaumannii')
            # OUTPUT
            #   --csv           Output CSV instead of TSV (default OFF)
            #   --json [X]      Also write results to this file in JSON format (default '')
            #   --label [X]     Replace FILE with this name instead (default '')
            #   --nopath        Strip filename paths from FILE column (default OFF)
            #   --novel [X]     Save novel alleles to this FASTA file (default '')
            #   --legacy        Use old legacy output with allele header row (requires --scheme) (default OFF)
            # SCORING
            #   --minid [n.n]   DNA %identity of full allelle to consider 'similar' [~] (default '95')
            #   --mincov [n.n]  DNA %cov to report partial allele at all [?] (default '10')
            #   --minscore [n.n] Minumum score out of 100 to match a scheme (when auto --scheme) (default '50')
            # PATHS
            #   --blastdb [X]   BLAST database
            #   --datadir [X]   PubMLST data
            # HOMEPAGE
            #   https://github.com/tseemann/mlst - Torsten Seemann
            extra="--nopath",
        log:
            "logs/{sample}.mlst.log",
        threads: 1
        wrapper:
            "v3.0.1/bio/mlst"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes
-----

* The ``extra`` param passes additional arguments to ``mlst``.
* For more information see https://github.com/tseemann/mlst

Software dependencies
---------------------

* ``mlst=2.23.0``

Input/Output
------------

**Input:**

* Genomic assembly (FASTA format)

**Output:**

* A tab-separated line containing the filename, the matching PubMLST scheme name, the ST (sequence type) and the allele IDs. Other output formats are also available (e.g. CSV, JSON).

Authors
-------

* Torsten Seemann (mlst tool) - https://github.com/tseemann/mlst
* Max Cummins (Snakemake wrapper [unaffiliated with Torsten Seemann])

Code
----

.. code-block:: python

    __author__ = "Max Cummins"
    __copyright__ = "Copyright 2021, Max Cummins"
    __email__ = "max.l.cummins@gmail.com"
    __license__ = "MIT"

    from snakemake.shell import shell
    from os import path

    # Redirect only stderr to the log file; stdout carries the mlst result.
    log = snakemake.log_fmt_shell(stdout=False, stderr=True)

    shell(
        "mlst"
        " {snakemake.params.extra}"
        " {snakemake.input.assembly}"
        " > {snakemake.output.mlst}"
        " {log}"
    )
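The tab-separated output described under Input/Output can be read back in Python for downstream steps. The following is a minimal sketch, not part of the wrapper: ``MlstResult`` and ``parse_mlst_line`` are hypothetical names, the example line is made up for illustration, and the parser assumes each allele column has the form ``gene(allele)`` in the default TSV output (no ``--csv`` or ``--legacy``).

.. code-block:: python

    import re
    from typing import Dict, NamedTuple


    class MlstResult(NamedTuple):
        """One parsed line of default mlst TSV output (hypothetical helper type)."""

        filename: str
        scheme: str
        sequence_type: str
        alleles: Dict[str, str]


    # Assumed allele column layout: gene name followed by the allele ID in parentheses.
    _ALLELE = re.compile(r"^(?P<gene>[^()]+)\((?P<allele>.*)\)$")


    def parse_mlst_line(line: str) -> MlstResult:
        """Split one tab-separated result line into filename, scheme, ST and alleles."""
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 tab-separated fields, got {len(fields)}")
        filename, scheme, sequence_type = fields[:3]
        alleles: Dict[str, str] = {}
        for field in fields[3:]:
            match = _ALLELE.match(field)
            if match is None:
                raise ValueError(f"unrecognised allele column: {field!r}")
            # Allele IDs stay strings because the value is not always a plain integer.
            alleles[match["gene"]] = match["allele"]
        return MlstResult(filename, scheme, sequence_type, alleles)


    # Illustrative line only; real output depends on the assembly and scheme.
    result = parse_mlst_line("sample.fasta\tsaureus\t239\tarcc(2)\taroe(3)\n")

The ST is kept as a string for the same reason as the allele IDs: an unassigned sequence type may not be numeric.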