fastq_screen screens a library of sequences in FASTQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.

This wrapper allows the configuration to be passed as a filename or as a dictionary in the rule’s params.fastq_screen_config of the rule. So the following configuration file:

DATABASE      ecoli   /data/Escherichia_coli/Bowtie2Index/genome      BOWTIE2
DATABASE      ecoli   /data/Escherichia_coli/Bowtie2Index/genome      BOWTIE
DATABASE      hg19    /data/hg19/Bowtie2Index/genome  BOWTIE2
DATABASE      mm10    /data/mm10/Bowtie2Index/genome  BOWTIE2
BOWTIE        /path/to/bowtie
BOWTIE2       /path/to/bowtie2


fastq_screen_config = {
 'database': {
   'ecoli': {
     'bowtie2': '/data/Escherichia_coli/Bowtie2Index/genome',
     'bowtie': '/data/Escherichia_coli/BowtieIndex/genome'},
   'hg19': {
     'bowtie2': '/data/hg19/Bowtie2Index/genome'},
   'mm10': {
     'bowtie2': '/data/mm10/Bowtie2Index/genome'}
 'aligner_paths': {'bowtie': 'bowtie', 'bowtie2': 'bowtie2'}

By default, the wrapper will use bowtie2 as the aligner and a subset of 100000 reads. These can be overridden using params.aligner and params.subset respectively. Furthermore, params.extra can be used to pass additional arguments verbatim to fastq_screen, for example extra="--illumina1_3" or extra="--bowtie2 '--trim5=8'".

Software dependencies

  • fastq-screen ==0.5.2
  • bowtie2 ==2.2.6
  • bowtie ==1.1.2


This wrapper can be used in the following way:

rule fastq_screen:
    threads: 8

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.


  • fastq_screen hard-codes the output filenames. This wrapper moves the hard-coded output files to those specified by the rule.
  • While the dictionary form of fastq_screen_config is convenient, the unordered nature of the dictionary may cause snakemake --list-params-changed to incorrectly report changed parameters even though the contents remain the same. If you plan on using --list-params-changed then it will be better to write a config file and pass that as fastq_screen_config. This problem will disappear with Python 3.6.
  • When providing the dictionary form of fastq_screen_config, the wrapper will write a temp file using Python’s tempfile module. To control the temp file directory, make sure the $TMPDIR environmental variable is set (see the tempfile docs) for details). One way of doing this is by adding something like shell.prefix("export TMPDIR=/scratch; ") to the snakefile calling this wrapper.


  • Ryan Dale


import os
from import shell
import tempfile

__author__ = "Ryan Dale"
__copyright__ = "Copyright 2016, Ryan Dale"
__email__ = ""
__license__ = "MIT"

_config = snakemake.params['fastq_screen_config']

subset = snakemake.params.get('subset', 100000)
aligner = snakemake.params.get('aligner', 'bowtie2')
extra = snakemake.params.get('extra', '')
log = snakemake.log_fmt_shell()

# snakemake.params.fastq_screen_config can be either a dict or a string. If
# string, interpret as a filename pointing to the fastq_screen config file.
# Otherwise, create a new tempfile out of the contents of the dict:
if isinstance(_config, dict):
    tmp = tempfile.NamedTemporaryFile(delete=False).name
    with open(tmp, 'w') as fout:
        for label, indexes in _config['database'].items():
            for aligner, index in indexes.items():
                    'DATABASE', label, index, aligner.upper()]) + '\n')
        for aligner, path in _config['aligner_paths'].items():
            fout.write('\t'.join([aligner.upper(), path]) + '\n')
    config_file = tmp
    config_file = _config

# fastq_screen hard-codes filenames according to this prefix. We will send
# hard-coded output to a temp dir, and then move them later.
prefix = os.path.basename(snakemake.input[0].split('.fastq')[0])
tempdir = tempfile.mkdtemp()

    "fastq_screen --outdir {tempdir} "
    "--force "
    "--aligner {aligner} "
    "--conf {config_file} "
    "--subset {subset} "
    "--threads {snakemake.threads} "
    "{extra} "
    "{snakemake.input[0]} "

# Move output to the filenames specified by the rule
shell("mv {tempdir}/{prefix}_screen.txt {snakemake.output.txt}")
shell("mv {tempdir}/{prefix}_screen.png {snakemake.output.png}")

# Clean up temp
shell("rm -r {tempdir}")
if isinstance(_config, dict):
    shell("rm {tmp}")