FASTP
Trim and quality-check FASTQ reads with fastp.
Example
This wrapper can be used in the following way:
rule fastp_se:
    input:
        sample=["reads/se/{sample}.fastq"]
    output:
        trimmed="trimmed/se/{sample}.fastq",
        failed="trimmed/se/{sample}.failed.fastq",
        html="report/se/{sample}.html",
        json="report/se/{sample}.json"
    log:
        "logs/fastp/se/{sample}.log"
    params:
        adapters="--adapter_sequence ACGGCTAGCTA",
        extra=""
    threads: 1
    wrapper:
        "v2.6.0-35-g755343f/bio/fastp"

rule fastp_pe:
    input:
        sample=["reads/pe/{sample}.1.fastq", "reads/pe/{sample}.2.fastq"]
    output:
        trimmed=["trimmed/pe/{sample}.1.fastq", "trimmed/pe/{sample}.2.fastq"],
        # Unpaired reads separately
        unpaired1="trimmed/pe/{sample}.u1.fastq",
        unpaired2="trimmed/pe/{sample}.u2.fastq",
        # or in a single file
        # unpaired="trimmed/pe/{sample}.singletons.fastq",
        merged="trimmed/pe/{sample}.merged.fastq",
        failed="trimmed/pe/{sample}.failed.fastq",
        html="report/pe/{sample}.html",
        json="report/pe/{sample}.json"
    log:
        "logs/fastp/pe/{sample}.log"
    params:
        adapters="--adapter_sequence ACGGCTAGCTA --adapter_sequence_r2 AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC",
        extra="--merge"
    threads: 2
    wrapper:
        "v2.6.0-35-g755343f/bio/fastp"

rule fastp_pe_wo_trimming:
    input:
        sample=["reads/pe/{sample}.1.fastq", "reads/pe/{sample}.2.fastq"]
    output:
        html="report/pe_wo_trimming/{sample}.html",
        json="report/pe_wo_trimming/{sample}.json"
    log:
        "logs/fastp/pe_wo_trimming/{sample}.log"
    params:
        extra=""
    threads: 2
    wrapper:
        "v2.6.0-35-g755343f/bio/fastp"

Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
- The adapters param specifies the adapter sequences; its value is passed to fastp as-is (e.g. --adapter_sequence ...).
- The extra param passes any additional arguments to fastp.
- For more information, see https://github.com/OpenGene/fastp
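One interaction between extra and the outputs is worth spelling out: the wrapper code listed under Code raises an error when a merged output is declared but --merge is missing from extra. A minimal sketch of that check, using the same regular expression as the wrapper, is shown here. The helper name has_merge_flag is hypothetical and not part of the wrapper.

```python
import re


def has_merge_flag(extra: str) -> bool:
    """Return True if `extra` contains `--merge` as a complete option name."""
    # Same pattern as the wrapper: the trailing `\b` keeps longer option
    # names such as `--merged_out` from being mistaken for `--merge`.
    return re.search(r"--merge\b", extra) is not None


print(has_merge_flag("--merge"))               # True
print(has_merge_flag("--merge --correction"))  # True
print(has_merge_flag("--merged_out x.fastq"))  # False
print(has_merge_flag(""))                      # False
```

In practice this means a rule that declares output.merged should also set extra="--merge", as the fastp_pe example does.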
Software dependencies
fastp=0.23.4
Input/Output
Input:
- fastq file(s)
Output:
- trimmed fastq file(s)
- unpaired reads (optional; either in a single file or in two separate files)
- merged reads (optional)
- failed reads (optional)
- json file containing trimming statistics
- html file containing trimming statistics
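The JSON report can be consumed by downstream rules or scripts. The sketch below computes the fraction of reads that survive filtering. It assumes the report contains a "summary" object with "before_filtering" and "after_filtering" entries that each carry "total_reads"; check these key names against a report produced by your fastp version. The function name read_retention and the numbers in the example are illustrative only.

```python
import json


def read_retention(report: dict) -> float:
    """Fraction of reads kept after filtering, from a parsed fastp JSON report."""
    # Assumed layout: report["summary"][...]["total_reads"] (verify locally).
    before = report["summary"]["before_filtering"]["total_reads"]
    after = report["summary"]["after_filtering"]["total_reads"]
    return after / before if before else 0.0


# Made-up numbers for illustration; a real report would be loaded from the
# path given as output.json, e.g. with json.load(open(path)).
example = json.loads(
    '{"summary": {"before_filtering": {"total_reads": 1000},'
    ' "after_filtering": {"total_reads": 950}}}'
)
print(read_retention(example))  # 0.95
```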
Authors
- Sebastian Kurscheid (sebastian.kurscheid@unibas.ch)
- Filipe G. Vieira
Code
__author__ = "Sebastian Kurscheid"
__copyright__ = "Copyright 2019, Sebastian Kurscheid"
__email__ = "sebastian.kurscheid@anu.edu.au"
__license__ = "MIT"

from snakemake.shell import shell
import re

extra = snakemake.params.get("extra", "")
adapters = snakemake.params.get("adapters", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


# Assert input
n = len(snakemake.input.sample)
assert (
    n == 1 or n == 2
), "input->sample must have 1 (single-end) or 2 (paired-end) elements."


# Input files
if n == 1:
    reads = "--in1 {}".format(snakemake.input.sample)
else:
    reads = "--in1 {} --in2 {}".format(*snakemake.input.sample)


# Output files
trimmed_paths = snakemake.output.get("trimmed", None)
if trimmed_paths:
    if n == 1:
        trimmed = "--out1 {}".format(snakemake.output.trimmed)
    else:
        trimmed = "--out1 {} --out2 {}".format(*snakemake.output.trimmed)

        # Output unpaired files
        unpaired = snakemake.output.get("unpaired", None)
        if unpaired:
            trimmed += f" --unpaired1 {unpaired} --unpaired2 {unpaired}"
        else:
            unpaired1 = snakemake.output.get("unpaired1", None)
            if unpaired1:
                trimmed += f" --unpaired1 {unpaired1}"
            unpaired2 = snakemake.output.get("unpaired2", None)
            if unpaired2:
                trimmed += f" --unpaired2 {unpaired2}"

        # Output merged PE reads
        merged = snakemake.output.get("merged", None)
        if merged:
            if not re.search(r"--merge\b", extra):
                raise ValueError(
                    "output.merged specified but '--merge' option missing from params.extra"
                )
            trimmed += f" --merged_out {merged}"
else:
    trimmed = ""


# Output failed reads
failed = snakemake.output.get("failed", None)
if failed:
    trimmed += f" --failed_out {failed}"


# Stats
html = "--html {}".format(snakemake.output.html)
json = "--json {}".format(snakemake.output.json)

shell(
    "(fastp --thread {snakemake.threads} "
    "{extra} "
    "{adapters} "
    "{reads} "
    "{trimmed} "
    "{json} "
    "{html} ) {log}"
)
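
To see which command line a given rule would produce without running Snakemake, the assembly logic above can be approximated by a standalone function. This is a sketch, not the wrapper itself: build_fastp_command is a hypothetical name, a plain dict stands in for snakemake.output, log redirection is left out, and empty pieces are dropped so the spacing may differ from the wrapper's actual shell string.

```python
import re


def build_fastp_command(sample, output, extra="", adapters="", threads=1):
    """Approximate the fastp command line assembled by the wrapper.

    `sample` is a list of 1 or 2 input paths; `output` is a plain dict
    standing in for `snakemake.output` (an assumption of this sketch).
    """
    n = len(sample)
    if n not in (1, 2):
        raise ValueError("sample must have 1 (single-end) or 2 (paired-end) elements.")

    reads = f"--in1 {sample[0]}" if n == 1 else f"--in1 {sample[0]} --in2 {sample[1]}"

    parts = []
    trimmed = output.get("trimmed")
    if trimmed:
        if n == 1:
            # Accept either a bare path or a one-element list.
            path = trimmed if isinstance(trimmed, str) else trimmed[0]
            parts.append(f"--out1 {path}")
        else:
            parts.append(f"--out1 {trimmed[0]} --out2 {trimmed[1]}")
            # Unpaired reads: one shared file, or separate files per mate.
            if output.get("unpaired"):
                u = output["unpaired"]
                parts.append(f"--unpaired1 {u} --unpaired2 {u}")
            else:
                for key in ("unpaired1", "unpaired2"):
                    if output.get(key):
                        parts.append(f"--{key} {output[key]}")
            # Merged reads are only written when --merge is requested.
            if output.get("merged"):
                if not re.search(r"--merge\b", extra):
                    raise ValueError(
                        "output.merged specified but '--merge' option missing from extra"
                    )
                parts.append(f"--merged_out {output['merged']}")
    if output.get("failed"):
        parts.append(f"--failed_out {output['failed']}")

    pieces = [
        "fastp",
        f"--thread {threads}",
        extra,
        adapters,
        reads,
        *parts,
        f"--json {output['json']}",
        f"--html {output['html']}",
    ]
    return " ".join(p for p in pieces if p)


cmd = build_fastp_command(
    ["reads/se/a.fastq"],
    {"trimmed": "trimmed/se/a.fastq", "html": "report/se/a.html", "json": "report/se/a.json"},
    adapters="--adapter_sequence ACGGCTAGCTA",
)
print(cmd)
```

With the single-end inputs shown, the printed command contains --in1 and --out1 only. Passing two input paths and a two-element trimmed list adds --in2 and --out2, mirroring the fastp_pe rule.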