FASTP

Trim and quality-control FASTQ reads with fastp.

Software dependencies

  • fastp ==0.20.0

Example

This wrapper can be used in the following way:

rule fastp_se:
    input:
        sample=["reads/se/{sample}.fastq"]
    output:
        trimmed="trimmed/se/{sample}.fastq",
        html="report/se/{sample}.html",
        json="report/se/{sample}.json"
    log:
        "logs/fastp/se/{sample}.log"
    params:
        extra=""
    threads: 1
    wrapper:
        "0.35.2/bio/fastp"


rule fastp_pe:
    input:
        sample=["reads/pe/{sample}.1.fastq", "reads/pe/{sample}.2.fastq"]
    output:
        trimmed=["trimmed/pe/{sample}.1.fastq", "trimmed/pe/{sample}.2.fastq"],
        html="report/pe/{sample}.html",
        json="report/pe/{sample}.json"
    log:
        "logs/fastp/pe/{sample}.log"
    params:
        extra=""
    threads: 2
    wrapper:
        "0.35.2/bio/fastp"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
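As a rough illustration of how the `{sample}` wildcard ties the freely chosen paths together, the sketch below fills the single-end patterns from the example rule for one sample. Snakemake resolves wildcards itself; plain `str.format` is used here only to show the resulting paths. The names `SE_PATTERNS` and `resolve_paths` and the sample name `a` are hypothetical and not part of the wrapper.

```python
# Hypothetical illustration only: Snakemake performs wildcard resolution
# internally; str.format merely mimics the result for a single wildcard.
SE_PATTERNS = {
    "input": "reads/se/{sample}.fastq",
    "trimmed": "trimmed/se/{sample}.fastq",
    "html": "report/se/{sample}.html",
    "json": "report/se/{sample}.json",
    "log": "logs/fastp/se/{sample}.log",
}


def resolve_paths(patterns, sample):
    """Fill the {sample} wildcard in every path pattern."""
    return {name: pattern.format(sample=sample) for name, pattern in patterns.items()}


paths = resolve_paths(SE_PATTERNS, "a")
for name, path in paths.items():
    print(f"{name}: {path}")
```

Whatever directory layout is chosen, the same wildcard has to appear in the input, output and log patterns of a rule so that each sample maps to its own set of files.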

Authors

  • Sebastian Kurscheid

Code

__author__ = "Sebastian Kurscheid"
__copyright__ = "Copyright 2019, Sebastian Kurscheid"
__email__ = "sebastian.kurscheid@anu.edu.au"
__license__ = "MIT"

from snakemake.shell import shell

# Optional extra fastp arguments; defaults to an empty string if params.extra is not set.
extra = snakemake.params.get("extra", "")
# Redirect both stdout and stderr of the command to the rule's log file.
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# One input file means single-end reads, two mean paired-end reads.
n = len(snakemake.input.sample)
assert n == 1 or n == 2, "input->sample must have 1 (single-end) or 2 (paired-end) elements."

if n == 1:
    reads = "--in1 {}".format(snakemake.input.sample)
    trimmed = "--out1 {}".format(snakemake.output.trimmed)
else:
    reads = "--in1 {} --in2 {}".format(*snakemake.input.sample)
    trimmed = "--out1 {} --out2 {}".format(*snakemake.output.trimmed)

html = "--html {}".format(snakemake.output.html)
json = "--json {}".format(snakemake.output.json)

# Use the default-aware `extra` from above rather than reading
# snakemake.params.extra directly.
shell(
    "(fastp --thread {snakemake.threads} {extra} "
    "{reads} "
    "{trimmed} "
    "{json} "
    "{html} ) {log}")
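
The command-line assembly above can be tried outside Snakemake with a small standalone sketch. It mirrors the wrapper's single-end/paired-end branching and argument order, but it is not the wrapper itself: the function `build_fastp_command` and its parameter names are hypothetical, and it only builds the argument list without running fastp.

```python
import shlex


def build_fastp_command(inputs, outputs, html, json_report, threads=1, extra=""):
    """Build a fastp argument list for 1 (single-end) or 2 (paired-end) input files."""
    if len(inputs) not in (1, 2):
        raise ValueError("inputs must have 1 (single-end) or 2 (paired-end) elements")
    if len(outputs) != len(inputs):
        raise ValueError("need exactly one trimmed output per input file")

    cmd = ["fastp", "--thread", str(threads)]
    cmd += shlex.split(extra)  # extra is a free-form string of additional options

    # Input reads: --in1 always, --in2 only for paired-end data.
    cmd += ["--in1", inputs[0]]
    if len(inputs) == 2:
        cmd += ["--in2", inputs[1]]

    # Trimmed reads: same pattern as the inputs.
    cmd += ["--out1", outputs[0]]
    if len(outputs) == 2:
        cmd += ["--out2", outputs[1]]

    cmd += ["--json", json_report, "--html", html]
    return cmd


se_cmd = build_fastp_command(
    ["reads/se/a.fastq"], ["trimmed/se/a.fastq"],
    html="report/se/a.html", json_report="report/se/a.json",
)
pe_cmd = build_fastp_command(
    ["reads/pe/a.1.fastq", "reads/pe/a.2.fastq"],
    ["trimmed/pe/a.1.fastq", "trimmed/pe/a.2.fastq"],
    html="report/pe/a.html", json_report="report/pe/a.json", threads=2,
)
print(" ".join(shlex.quote(part) for part in se_cmd))
print(" ".join(shlex.quote(part) for part in pe_cmd))
```

Building a list instead of a single string is a design choice of this sketch: it avoids quoting problems when paths contain spaces. The wrapper itself formats one shell string and appends the log redirection.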