.. _`bio/art/profiler_illumina`:

ART_PROFILER_ILLUMINA
=====================

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/art/profiler_illumina?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/art/profiler_illumina

Use the art profiler to create a base quality score profile for Illumina read data from a fastq file.

**URL**: https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule art_profiler_illumina:
        input:
            "data/{sample}.fq",
        output:
            "profiles/{sample}.txt"
        log:
            "logs/art_profiler_illumina/{sample}.log"
        params:
            ""
        threads: 2
        wrapper:
            "v3.0.1/bio/art/profiler_illumina"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes
-----

Your input file must have one of the following extensions: fastq, fastq.gz, fq or fq.gz.

Software dependencies
---------------------

* ``art=2016.06.05``

Input/Output
------------

**Input:**

* Path to the fastq-formatted input file (first entry in the list of input files)

**Output:**

* Path to the txt-formatted profile (first entry in the list of output files)

Params
------

* Extra parameters (no keyword-mapped parameter): the value of ``params`` is inserted verbatim into the ``art_profiler_illumina`` command line.

Authors
-------

* David Laehnemann
* Victoria Sack

Code
----

.. code-block:: python

    __author__ = "David Laehnemann, Victoria Sack"
    __copyright__ = "Copyright 2018, David Laehnemann, Victoria Sack"
    __email__ = "david.laehnemann@hhu.de"
    __license__ = "MIT"

    from snakemake.shell import shell
    import os
    import tempfile
    import re

    # Create temporary directory that will only contain the symbolic link to the
    # input file, in order to sanely work with the art_profiler_illumina cli
    with tempfile.TemporaryDirectory() as temp_input:
        # ensure that .fastq and .fastq.gz input files work, as well
        filename = os.path.basename(snakemake.input[0]).replace(".fastq", ".fq")
        # figure out the exact file extension after the above substitution
        # (raw string, so that "\." is not an invalid string escape sequence)
        ext = re.search(r"fq(\.gz)?$", filename)
        if ext:
            fq_extension = ext.group(0)
        else:
            raise IOError(
                "Incompatible extension: This art_profiler_illumina "
                "wrapper requires input files with one of the following "
                "extensions: fastq, fastq.gz, fq or fq.gz. Please adjust "
                "your input and the invocation of the wrapper accordingly."
            )
        os.symlink(
            # snakemake paths are relative, but the symlink needs to be absolute
            os.path.abspath(snakemake.input[0]),
            # the following awkward file name generation has reasons:
            # * the file name needs to be unique to the execution of the
            #   rule, as art will create and mv temporary files with its basename
            #   in the output directory, which causes utter confusion when
            #   executing instances of the rule in parallel
            # * temp file name cannot have any read infixes before the file
            #   extension, because otherwise art does read enumeration magic
            #   that messes up output file naming
            os.path.join(
                temp_input,
                filename.replace(
                    "." + fq_extension, "_preventing_art_magic_spacer." + fq_extension
                ),
            ),
        )

        # include output folder name in the profile_name command line argument and
        # strip off the file extension, as art will add its own ".txt"
        profile_name = os.path.join(
            os.path.dirname(snakemake.output[0]), filename.replace("." + fq_extension, "")
        )

        shell(
            "( art_profiler_illumina {snakemake.params} {profile_name}"
            " {temp_input} {fq_extension} {snakemake.threads} ) 2> {snakemake.log}"
        )

.. |nl| raw:: html
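
The file-name handling in the wrapper code can be traced in isolation. The following is a minimal sketch and not part of the wrapper: ``plan_art_names`` is a hypothetical helper introduced here for illustration. It only repeats the string operations from the wrapper code and neither creates symlinks nor calls ``art_profiler_illumina``.

.. code-block:: python

    import os
    import re


    def plan_art_names(input_path, output_path):
        """Hypothetical helper mirroring the wrapper's file-name handling.

        Returns the extension passed to art, the name of the symlink placed
        in the temporary directory, and the profile name without ".txt".
        """
        # normalise .fastq / .fastq.gz to .fq / .fq.gz, as the wrapper does
        filename = os.path.basename(input_path).replace(".fastq", ".fq")
        ext = re.search(r"fq(\.gz)?$", filename)
        if not ext:
            raise IOError("Incompatible extension: " + input_path)
        fq_extension = ext.group(0)
        # spacer before the extension, same as in the wrapper code
        link_name = filename.replace(
            "." + fq_extension, "_preventing_art_magic_spacer." + fq_extension
        )
        # output directory plus base name, since art appends ".txt" itself
        profile_name = os.path.join(
            os.path.dirname(output_path), filename.replace("." + fq_extension, "")
        )
        return fq_extension, link_name, profile_name


    if __name__ == "__main__":
        # example paths are made up for illustration
        for path in ("data/a.fq", "data/a.fastq.gz"):
            print(path, plan_art_names(path, "profiles/a.txt"))

Because the profile name is derived from the input file name and only the directory is taken from the output path, the sketch suggests that the output file name should match the input base name plus ``.txt``, as in the example rule (``data/{sample}.fq`` and ``profiles/{sample}.txt``).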