Use the art profiler (art_profiler_illumina) to create a base quality score profile for Illumina read data from a FASTQ file.

Software dependencies

  • art ==2016.06.05


This wrapper can be used in the following way:

rule art_profiler_illumina:
    params: ""
    threads: 2

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.


Authors

  • David Laehnemann
  • Victoria Sack


Code

__author__ = "David Laehnemann, Victoria Sack"
__copyright__ = "Copyright 2018, David Laehnemann, Victoria Sack"
__email__ = ""
__license__ = "MIT"

from snakemake.shell import shell
import os
import tempfile
import re

# Create temporary directory that will only contain the symbolic link to the
# input file, in order to sanely work with the art_profiler_illumina cli
with tempfile.TemporaryDirectory() as temp_input:
    # ensure that .fastq and .fastq.gz input files work, as well
    filename = os.path.basename(snakemake.input[0]).replace(".fastq", ".fq")

    # figure out the exact file extension after the above substitution
    ext = re.search(r"fq(\.gz)?$", filename)
    if ext:
        fq_extension = ext.group(0)
    else:
        raise IOError("Incompatible extension: This art_profiler_illumina "
                    "wrapper requires input files with one of the following "
                    "extensions: fastq, fastq.gz, fq or fq.gz. Please adjust "
                    "your input and the invocation of the wrapper accordingly.")

    os.symlink(
        # snakemake paths are relative, but the symlink needs to be absolute
        os.path.abspath(snakemake.input[0]),
        # the following awkward file name generation has reasons:
        # * the file name needs to be unique to the execution of the
        #   rule, as art will create and mv temporary files with its basename
        #   in the output directory, which causes utter confusion when
        #   executing instances of the rule in parallel
        # * temp file name cannot have any read infixes before the file
        #   extension, because otherwise art does read enumeration magic
        #   that messes up output file naming
        os.path.join(
            temp_input,
            filename.replace(
                "." + fq_extension,
                "_preventing_art_magic_spacer." + fq_extension,
            ),
        ),
    )

    # include output folder name in the profile_name command line argument and
    # strip off the file extension, as art will add its own ".txt"
    profile_name = os.path.join(
        os.path.dirname(snakemake.output[0]),
        filename.replace("." + fq_extension, ""),
    )

    shell(
        "( art_profiler_illumina {snakemake.params} {profile_name}"
        " {temp_input} {fq_extension} {snakemake.threads} ) 2> {snakemake.log}"
    )
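
The file name handling above can be tried outside of Snakemake. The following is a minimal sketch, not part of the wrapper: the helper names derive_art_names and stage_input and the example paths are hypothetical and chosen only for illustration. It mirrors the string handling of the wrapper code (extension normalisation, extension detection, spacer insertion and profile name construction) and the creation of the absolute symbolic link, without calling art_profiler_illumina itself.

```python
import os
import re
import tempfile


def derive_art_names(input_path, output_path):
    """Return (fq_extension, link_name, profile_name) for one input file.

    Hypothetical helper for illustration; it mirrors the string handling
    of the wrapper code but is not part of the wrapper.
    """
    # Normalise .fastq / .fastq.gz to .fq / .fq.gz
    filename = os.path.basename(input_path).replace(".fastq", ".fq")
    # Detect the exact extension after the substitution
    match = re.search(r"fq(\.gz)?$", filename)
    if match is None:
        raise IOError("Incompatible extension: " + filename)
    fq_extension = match.group(0)
    # The spacer keeps read infixes away from the file extension
    link_name = filename.replace(
        "." + fq_extension, "_preventing_art_magic_spacer." + fq_extension
    )
    # The profile name carries the output folder but no file extension
    profile_name = os.path.join(
        os.path.dirname(output_path), filename.replace("." + fq_extension, "")
    )
    return fq_extension, link_name, profile_name


def stage_input(input_path, temp_dir, link_name):
    """Create an absolute symbolic link to input_path inside temp_dir."""
    link_path = os.path.join(temp_dir, link_name)
    os.symlink(os.path.abspath(input_path), link_path)
    return link_path


if __name__ == "__main__":
    # Example paths are made up for this sketch
    print(derive_art_names("data/a.1.fastq.gz", "profiles/a.1.txt"))
```

With the made-up paths above, the input a.1.fastq.gz is treated as a.1.fq.gz, the detected extension is fq.gz, and the link name becomes a.1_preventing_art_magic_spacer.fq.gz. Note that the pattern only checks that the name ends in fq or fq.gz, so it is a sanity check rather than a strict validation of the input format.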