Contributing
We invite anybody to contribute to the Snakemake Wrapper Repository. If you want to contribute, we suggest the following procedure:
Fork the repository: https://github.com/snakemake/snakemake-wrappers
Clone your fork locally.
Locally, create a new branch:
git checkout -b my-new-snakemake-wrapper
Commit your contributions to that branch and push them to your fork:
git push -u origin my-new-snakemake-wrapper
Create a pull request.
The pull request will be reviewed and included as quickly as possible.
If your pull request does not get a review quickly, you can @mention (https://github.blog/2011-03-23-mention-somebody-they-re-notified/) previous contributors to a particular wrapper (see git blame) or regular contributors who you think might be able to give a review.
In general, always take inspiration from existing wrappers. Beyond that, contributions should:
provide the following files:
  meta.yaml (wrapper description), see meta.yaml file
  environment.yaml (required software), see environment.yaml file
  environment.linux-64.pin.txt (autogenerated pinning of the software), see environment.yaml file
  wrapper.py or wrapper.R (actual wrapper code), see wrapper.py or wrapper.R file
  test/Snakefile (minimal test cases and copy-pasteable examples), see test/Snakefile file
amend test.py to call all of the testing rules provided in test/Snakefile, see test.py tests file
ensure consistent:
  formatting of Python files
  linting of Snakefiles
conda/mamba environment for development
To have all the tools you need for developing and testing wrappers in one single conda/mamba environment:
Set up the channels as described for bioconda.
Create an environment with the necessary dependencies:
mamba create -n snakemake-wrappers-development -c conda-forge -c bioconda snakemake snakefmt snakedeploy black mamba pytest
Activate the environment with:
mamba activate snakemake-wrappers-development
meta.yaml file
This file describes the wrapper and how to use it.
The general file syntax is YAML. Text / strings (values in a YAML mapping or sequence) can use reStructuredText syntax.
The following fields are available to use in the wrapper meta.yaml file.
All, except those marked optional, should be provided.
Especially make sure to include a URL of the respective tool’s documentation.
name: The name of the wrapper.
description: A description of what the wrapper does.
url: URL of the wrapped tool’s webpage.
authors: A sequence of names of the people who have contributed to the wrapper.
input: A mapping or sequence of required inputs for the wrapper.
output: A mapping or sequence of output(s) from the wrapper.
params (optional): A mapping of parameters that can be used in the wrapper’s params directive. If no parameters are used for the wrapper, this field can be omitted.
notes (optional): Anything of note that does not fit into the scope of the other fields.
You can add a newline to the rendered text in these fields with the addition of |nl|.
Example
name: seqtk mergepe
description: Interleave two paired-end FASTA/Q files
url: https://github.com/lh3/seqtk
authors:
  - Michael Hall
input:
  - paired fastq files - can be compressed.
output:
  - >
    a single, interleaved FASTA/Q file. By default, the output will be compressed,
    use the param ``compress_lvl`` to change this.
params:
  compress_lvl: >
    Regulate the speed of compression using the specified digit,
    where 1 indicates the fastest compression method (less compression)
    and 9 indicates the slowest compression method (best compression).
    0 is no compression. 11 gives a few percent better compression at a severe cost
    in execution time, using the zopfli algorithm. The default is 6.
notes: Multiple threads can be used during compression of the output file with ``pigz``.
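As a quick sanity check of the field list above, the following minimal sketch reports which non-optional fields are missing from an already parsed meta.yaml mapping. The helper name missing_meta_fields and the REQUIRED_FIELDS tuple are hypothetical and not part of the repository's tooling. Parsing the YAML file itself (for example with PyYAML) is left out to keep the sketch dependency-free.

```python
# Hypothetical helper, not part of snakemake-wrappers: checks a parsed meta.yaml.
# The tuple lists the fields that are not marked optional in the list above.
REQUIRED_FIELDS = ("name", "description", "url", "authors", "input", "output")


def missing_meta_fields(meta: dict) -> list[str]:
    """Return the required fields that are absent or empty in `meta`."""
    return [field for field in REQUIRED_FIELDS if not meta.get(field)]


# A parsed meta.yaml, abbreviated from the seqtk mergepe example,
# with the output field deliberately left out.
meta = {
    "name": "seqtk mergepe",
    "description": "Interleave two paired-end FASTA/Q files",
    "url": "https://github.com/lh3/seqtk",
    "authors": ["Michael Hall"],
    "input": ["paired fastq files - can be compressed."],
}

print(missing_meta_fields(meta))  # prints ['output']
```

Treating empty values as missing is a design choice of this sketch; it catches a field that is present but was left blank.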
environment.yaml file
This file needs to list all the software that the wrapper code needs to run successfully.
For all software following semantic versioning conventions, specify (and thus pin) the major and minor version, but leave the patch version unspecified.
Also, only specify the actual software needed and let conda/mamba determine the dependencies, unless you need to work around version incompatibilities that the conda packages themselves do not handle properly.
To make sure that conda/mamba knows where to look for the package, include a list of all of the conda channels that the software and its dependencies require.
This will usually include conda-forge, as it contains many essential libraries that other packages and tools depend on.
This channel should usually be specified first, to make sure it takes precedence (snakemake asks users to run conda config --set channel_priority strict).
In addition, you may need to include other sustainable, community-maintained channels (like bioconda).
And as the last channel specification, always include nodefaults.
This avoids software dependency conflicts between the conda-forge channel and the defaults channels, which should no longer be needed.
Finally, make sure to run snakedeploy pin-conda-envs environment.yaml on the finished environment specification.
This will generate a file called environment.linux-64.pin.txt with all the dependency versions determined by conda/mamba, ensuring that a particular wrapper version will always generate the exact same environment with the exact package versions from this file.
You should include this pinning file in the pull request for your wrapper.
Example
channels:
  - conda-forge
  - bioconda
  - nodefaults
dependencies:
  - bioconductor-biomart =2.58
  - r-nanoparquet =0.3
  - r-tidyverse =2.0
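The conventions described in this section can be spot-checked with a small script. The sketch below is only an illustration: check_environment and PIN_PATTERN are hypothetical names, not part of snakedeploy or this repository. It assumes that the environment.yaml has already been parsed into a dict and that every dependency follows semantic versioning. Since conda-forge only "usually" comes first, treat its findings as hints rather than hard errors.

```python
import re

# Hypothetical checker, not part of snakemake-wrappers or snakedeploy.
# Matches "<package> =<major>.<minor>", i.e. a pin without a patch version.
PIN_PATTERN = re.compile(r"^\S+ =\d+\.\d+$")


def check_environment(env: dict) -> list[str]:
    """Return a list of convention violations for a parsed environment.yaml."""
    problems = []
    channels = env.get("channels", [])
    if not channels or channels[-1] != "nodefaults":
        problems.append("last channel should be nodefaults")
    if "conda-forge" in channels and channels[0] != "conda-forge":
        problems.append("conda-forge should be listed first")
    for dep in env.get("dependencies", []):
        if not PIN_PATTERN.match(dep):
            problems.append(f"pin major and minor version only: {dep!r}")
    return problems


env = {
    "channels": ["conda-forge", "bioconda", "nodefaults"],
    "dependencies": ["bioconductor-biomart =2.58", "r-nanoparquet =0.3"],
}
print(check_environment(env))  # prints []

# A patch-level pin, a misplaced conda-forge, and a missing nodefaults
# entry are each reported as a separate problem.
bad_env = {
    "channels": ["bioconda", "conda-forge"],
    "dependencies": ["sometool =1.2.3"],  # hypothetical package name
}
for problem in check_environment(bad_env):
    print(problem)
```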
wrapper.py or wrapper.R file
This is the actual code that the wrapper executes. It is handled like an external script in snakemake, so you have the respective snakemake objects available.
Please ensure that the wrapper:
can deal with arbitrary input: and output: paths and filenames
redirects stdout and stderr to log files specified by the log: directive (typical boilerplate code can for example be found in this knowledge base)
automatically infers command line arguments wherever possible (for example based on file extensions in input: and output:)
passes on the threads value, if the used tool(s) allow(s) it
writes any temporary files to a unique hidden folder in the working directory, or (better) stores them where the Python function tempfile.gettempdir() points (this also means that using any Python tempfile default behavior works)
is formatted according to the language’s standards (for Python, format it with black: black wrapper.py)
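The points above can be illustrated with a minimal sketch of how a wrapper.py might assemble its command line. Everything tool-specific here is an assumption: sometool and its flags (--threads, --compress, --tmp-dir, -o) are hypothetical, and build_command is an illustrative helper, not an existing utility. To keep the sketch runnable on its own, a SimpleNamespace stands in for the snakemake object that Snakemake provides to real wrapper scripts.

```python
import shlex
import tempfile
from types import SimpleNamespace


def build_command(snakemake, tmpdir):
    """Assemble the shell command for the hypothetical tool `sometool`."""
    out_path = str(snakemake.output[0])
    # Pass on the threads value from the rule.
    args = ["sometool", "--threads", str(snakemake.threads)]
    # Infer arguments from the output file extension instead of asking for a param.
    if out_path.endswith(".gz"):
        args.append("--compress")
    # Work with arbitrary paths: take them from input/output and quote them.
    args += ["--tmp-dir", tmpdir, "-o", out_path, str(snakemake.input[0])]
    command = " ".join(shlex.quote(arg) for arg in args)
    # Redirect stdout and stderr to the file from the log: directive. Real
    # wrappers usually get this suffix from
    # snakemake.log_fmt_shell(stdout=True, stderr=True).
    return f"{command} > {shlex.quote(str(snakemake.log[0]))} 2>&1"


# Stand-in for the `snakemake` object that is available inside a real wrapper.
snakemake = SimpleNamespace(
    input=["reads/a.fastq"],
    output=["results/a.out.gz"],
    log=["logs/a.log"],
    threads=4,
)

# TemporaryDirectory() creates its folder below tempfile.gettempdir() by default
# and removes it again when the block is left.
with tempfile.TemporaryDirectory() as tmpdir:
    command = build_command(snakemake, tmpdir)
    print(command)
    # A real wrapper would now execute it, for example with
    # `from snakemake.shell import shell` followed by `shell(command)`.
```

Building the argument list first and quoting each element keeps the command valid for paths that contain spaces or other shell-special characters.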
For repeatedly needed functionality, you can use the snakemake-wrapper-utils. Use what is available or create new functionality there whenever you start repeating functions across wrappers. Examples of this are:
The command line argument parsing for a software tool like samtools, where you create one wrapper each for a number of different subcommands that share the main arguments. See the samtools.py utility functions for the respective functionality.
The handling of recurring Java options, for example things like memory handling. See java.py for the respective functionality.
To use snakemake-wrapper-utils, you have to include them as a dependency in your environment.yaml file and import the respective function(s) in your wrapper.py or wrapper.R script (for example from snakemake_wrapper_utils.java import get_java_opts).
test/Snakefile file
In a subfolder called test, create a Snakefile with example invocations of the wrapper.
These examples should comprehensively showcase the available functionality of the wrapper, as they serve as both the copy-pasteable examples rendered in the documentation, and the test cases run in the continuous integration testing (make sure to include calls to the rules in test.py, see test.py tests file).
If these rules need any input data, you can also include minimal (small) testing data in the test/ folder (also check existing wrappers for suitable data).
When writing the Snakefile, please ensure that:
rule names in the examples are in snake_case and descriptive (they should explain what the rule does, or match the tool’s purpose or name; for example map_reads for a step that maps reads)
it is formatted correctly by running snakefmt (snakefmt Snakefile)
it also passes linting, see Linting
all example rules in your test/Snakefile have an invocation as a test case in test.py, see test.py tests file
wherever you can do this with a short comment, possible settings are explained for all keywords like input:, output:, params:, threads:, etc. (provide longer explanations in the meta.yaml file)
a sensible default is provided for threads:, if more than one thread can be used by the wrapper
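The naming requirement from the list above is easy to check mechanically. The following sketch is an assumption-laden illustration, not an existing tool: non_snake_case_rules is a hypothetical helper that scans the text of a Snakefile with regular expressions instead of using Snakemake's own parser, and the wrapper path bio/sometool in the sample text is made up. It does not replace snakefmt or snakemake --lint.

```python
import re

# Hypothetical helper, not part of snakemake or snakefmt.
# Matches rule headers such as "rule map_reads:" at the start of a line.
RULE_HEADER = re.compile(r"^rule\s+(\w+)\s*:", re.MULTILINE)
# Lowercase words and digits, separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def non_snake_case_rules(snakefile_text: str) -> list[str]:
    """Return the names of all rules that are not written in snake_case."""
    return [
        name
        for name in RULE_HEADER.findall(snakefile_text)
        if not SNAKE_CASE.match(name)
    ]


# Sample Snakefile text; the wrapper path "bio/sometool" is hypothetical.
example = '''
rule map_reads:
    input:
        "reads/{sample}.fastq",
    output:
        "mapped/{sample}.bam",
    threads: 4  # sensible default, since the tool can use several threads
    wrapper:
        "master/bio/sometool"


rule MapReads2:
    output:
        "mapped/{sample}.2.bam",
    wrapper:
        "master/bio/sometool"
'''

print(non_snake_case_rules(example))  # prints ['MapReads2']
```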
test.py tests file
Every example rule listed in a test/Snakefile file should be included as a test case in test.py.
The easiest way is usually to duplicate an existing test and adapt it to your newly added example rule.
When done editing, make sure that the formatting of test.py still follows black standards, see Formatting.
Example
@skip_if_not_modified
def test_bcftools_sort():
    run(
        "bio/bcftools/sort",
        ["snakemake", "--cores", "1", "--use-conda", "-F", "a.sorted.bcf"],
    )
Formatting
Please ensure Python files such as test.py and wrapper.py are formatted with black. Additionally, please format your test Snakefile with snakefmt.
Linting
Please lint your test Snakefile with:
snakemake -s <path/to/wrapper/test/Snakefile> --lint
Testing locally
If you want to debug your contribution locally before creating a pull request, ensure you have the conda/mamba environment for development installed and activated.
Afterwards, from the main directory of the repo, you can run the test(s) for your contribution by specifying an expression that matches the name(s) of your test(s) via the -k option of pytest:
pytest test.py -v -k your_test
If you also want to test the docs generation locally, create another environment and activate it:
mamba create -n test-snakemake-wrapper-docs -c conda-forge sphinx sphinx_rtd_theme pyyaml sphinx-copybutton sphinxawesome_theme myst-parser
mamba activate test-snakemake-wrapper-docs
Then, enter the respective directory and build the docs:
cd docs
make html
If it runs through, you can open the main page at docs/_build/html/index.html in a web browser.
If you want to start fresh, you can clean up the build with make clean.