TRIM_GALORE-PE

Trim paired-end reads using trim_galore.

URL:

Example

This wrapper can be used in the following way:

rule trim_galore_pe:
    input:
        ["reads/{sample}.1.fastq.gz", "reads/{sample}.2.fastq.gz"],
    output:
        "trimmed/{sample}.1_val_1.fq.gz",
        "trimmed/{sample}.1.fastq.gz_trimming_report.txt",
        "trimmed/{sample}.2_val_2.fq.gz",
        "trimmed/{sample}.2.fastq.gz_trimming_report.txt",
    params:
        extra="--illumina -q 20",
    log:
        "logs/trim_galore/{sample}.log",
    wrapper:
        "v1.1.0/bio/trim_galore/pe"

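Note that input and log file paths can be chosen freely. Output paths are more constrained: all four files must share one directory, and their file names must match the names trim_galore derives from the input files, as in the example above.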

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • trim-galore==0.6.6

Input/Output

Input:

  • two (paired-end) fastq files (can be gzip compressed)

Output:

  • two trimmed (paired-end) fastq files
  • two trimming reports
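
The output names in the example rule follow a pattern that can be sketched in Python. This is a minimal illustration inferred from the example above (`{sample}.1.fastq.gz` becomes `{sample}.1_val_1.fq.gz` plus `{sample}.1.fastq.gz_trimming_report.txt`), not code from the wrapper. The helper name `expected_outputs`, the suffix list, and the handling of uncompressed input are assumptions.

```python
import posixpath

# Assumption: these suffixes are stripped before "_val_N" is appended.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def expected_outputs(input_path, mate, out_dir):
    """Return (trimmed_fastq, report) paths for one mate (1 or 2).

    Hypothetical helper: mirrors the naming seen in the example rule.
    """
    name = posixpath.basename(input_path)
    stem = name
    for suffix in FASTQ_SUFFIXES:
        if name.endswith(suffix):
            stem = name[: -len(suffix)]
            break
    # Assumption: output stays gzip compressed only when the input was.
    trimmed = f"{stem}_val_{mate}.fq" + (".gz" if name.endswith(".gz") else "")
    report = f"{name}_trimming_report.txt"
    return posixpath.join(out_dir, trimmed), posixpath.join(out_dir, report)


if __name__ == "__main__":
    for mate, path in enumerate(["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"], 1):
        print(expected_outputs(path, mate, "trimmed"))
```

For the sample `a`, this reproduces the four output paths listed in the example rule, which makes it a quick way to check an `output:` section before running the workflow.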

Params

  • extra: additional command-line arguments passed to trim_galore (must not include --fastqc)

Notes

  • Use the fastqc Snakemake wrapper instead of the --fastqc option; the wrapper raises an error if --fastqc appears in extra.
  • All output files must be placed in the same directory.
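
Both notes can be checked before a workflow is run. The following is a rough sketch of such a pre-flight check, assuming the rule's `extra` string and output paths are available as plain Python values. The function `check_rule` is hypothetical and is not part of the wrapper or of Snakemake.

```python
import posixpath


def check_rule(extra, outputs):
    """Return a list of problems; an empty list means both notes are met.

    Hypothetical helper for illustration only.
    """
    problems = []
    # Note 1: --fastqc must not be passed through `extra`.
    if "--fastqc" in extra:
        problems.append("remove --fastqc from extra; use the fastqc wrapper")
    # Note 2: every output file must live in the same directory.
    dirs = {posixpath.dirname(path) for path in outputs}
    if len(dirs) > 1:
        problems.append(
            "outputs span several directories: " + ", ".join(sorted(dirs))
        )
    return problems


if __name__ == "__main__":
    outputs = [
        "trimmed/a.1_val_1.fq.gz",
        "trimmed/a.1.fastq.gz_trimming_report.txt",
        "trimmed/a.2_val_2.fq.gz",
        "trimmed/a.2.fastq.gz_trimming_report.txt",
    ]
    print(check_rule("--illumina -q 20", outputs))
```

Returning a list rather than raising on the first failure is a design choice of this sketch: it reports every problem in one pass.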

Authors

  • Kerrin Mendler

Code

"""Snakemake wrapper for trimming paired-end reads using trim_galore."""

__author__ = "Kerrin Mendler"
__copyright__ = "Copyright 2018, Kerrin Mendler"
__email__ = "mendlerke@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell
import os.path


log = snakemake.log_fmt_shell()

# Check that two input files were supplied
n = len(snakemake.input)
assert n == 2, "Input must contain 2 files. Given: %r." % n

# Don't run with `--fastqc` flag
if "--fastqc" in snakemake.params.get("extra", ""):
    raise ValueError(
        "The trim_galore Snakemake wrapper cannot "
        "be run with the `--fastqc` flag. Please "
        "remove the flag from extra params. "
        "You can use the fastqc Snakemake wrapper on "
        "the input and output files instead."
    )

# Check that four output files were supplied
m = len(snakemake.output)
assert m == 4, "Output must contain 4 files. Given: %r." % m

# Check that all output files are in the same directory
out_dir = os.path.dirname(snakemake.output[0])
for file_path in snakemake.output[1:]:
    assert out_dir == os.path.dirname(file_path), (
        "trim_galore can only output files to a single directory."
        " Please indicate only one directory for the output files."
    )

# Fall back to an empty string so a rule without `extra` still runs
extra = snakemake.params.get("extra", "")

shell(
    "(trim_galore"
    " {extra}"
    " --paired"
    " -o {out_dir}"
    " {snakemake.input})"
    " {log}"
)
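
To see what the shell call above expands to, the command string can be assembled by hand. This is a simplified sketch with a hypothetical `build_command` function. It leaves out the log redirection that `snakemake.log_fmt_shell()` appends and assumes the input files are joined with single spaces.

```python
def build_command(extra, out_dir, inputs):
    """Assemble the trim_galore call in the same order as the wrapper.

    Hypothetical helper: omits the parentheses and the log redirection.
    """
    parts = ["trim_galore"]
    if extra:
        parts.append(extra)  # user-supplied flags come first
    parts += ["--paired", "-o", out_dir]
    parts += list(inputs)  # both mates, in the order given
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_command(
            "--illumina -q 20",
            "trimmed",
            ["reads/a.1.fastq.gz", "reads/a.2.fastq.gz"],
        )
    )
```

For the example rule with sample `a`, this prints `trim_galore --illumina -q 20 --paired -o trimmed reads/a.1.fastq.gz reads/a.2.fastq.gz`.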