.. _`bio/hmmer/hmmscan`: HMMSCAN ======= .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/hmmer/hmmscan?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/hmmer/hmmscan search protein sequence(s) against a protein profile database Example ------- This wrapper can be used in the following way: .. code-block:: python rule hmmscan_profile: input: fasta="test-protein.fa", profile="test-profile.hmm.h3f", output: # only one of these is required tblout="test-prot-tbl.txt", # save parseable table of per-sequence hits to file domtblout="test-prot-domtbl.txt", # save parseable table of per-domain hits to file pfamtblout="test-prot-pfamtbl.txt", # save table of hits and domains to file, in Pfam format outfile="test-prot-out.txt", # Direct the main human-readable output to a file instead of the default stdout. log: "logs/hmmscan.log" params: evalue_threshold=0.00001, # if bitscore threshold provided, hmmscan will use that instead #score_threshold=50, extra="", threads: 4 wrapper: "v3.0.1/bio/hmmer/hmmscan" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Software dependencies --------------------- * ``hmmer=3.4`` Input/Output ------------ **Input:** * protein sequence file (fasta) * database hmm files **Output:** * matches to hmm files Authors ------- * N Tessa Pierce Code ---- .. code-block:: python """Snakemake wrapper for hmmscan""" __author__ = "N. Tessa Pierce" __copyright__ = "Copyright 2019, N. Tessa Pierce" __email__ = "ntpierce@gmail.com" __license__ = "MIT" from os import path from snakemake.shell import shell profile = snakemake.input.get("profile") profile = profile.rsplit(".h3", 1)[0] assert profile.endswith(".hmm"), 'your profile file should end with ".hmm" ' # Direct the main human-readable output to a file instead of the default stdout. out_cmd = "" outfile = snakemake.output.get("outfile", "") if outfile: out_cmd += " -o {} ".format(outfile) # save parseable table of per-sequence hits to file tblout = snakemake.output.get("tblout", "") if tblout: out_cmd += " --tblout {} ".format(tblout) # save parseable table of per-domain hits to file domtblout = snakemake.output.get("domtblout", "") if domtblout: out_cmd += " --domtblout {} ".format(domtblout) # save table of hits and domains to file, in Pfam format pfamtblout = snakemake.output.get("pfamtblout", "") if pfamtblout: out_cmd += " --pfamtblout {} ".format(pfamtblout) ## default params: enable evalue threshold. If bitscore thresh is provided, use that instead (both not allowed) # report models >= this score threshold in output evalue_threshold = snakemake.params.get("evalue_threshold", 0.00001) score_threshold = snakemake.params.get("score_threshold", "") if score_threshold: thresh_cmd = " -T {} ".format(float(score_threshold)) else: thresh_cmd = " -E {} ".format(float(evalue_threshold)) # all other params should be entered in "extra" param extra = snakemake.params.get("extra", "") log = snakemake.log_fmt_shell(stdout=False, stderr=True) shell( "hmmscan {out_cmd} {thresh_cmd} --cpu {snakemake.threads}" " {extra} {profile} {snakemake.input.fasta} {log}" ) .. |nl| raw:: html