.. _`bio/dada2/dereplicate-fastq`:

DADA2_DEREPLICATE_FASTQ
=======================

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/dada2/dereplicate-fastq?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/dada2/dereplicate-fastq

DADA2 dereplication of FASTQ files using the dada2 ``derepFastq`` function. Optional parameters are documented in the manual. Although the function is not introduced explicitly in the tutorial, it is used under the hood in the ``learnErrors`` section.

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    rule dada2_dereplicate_fastq:
        input:
            # Quality filtered FASTQ file
            "filtered/{fastq}.fastq"
        output:
            # Dereplicated sequences stored as `derep-class` object in an RDS file
            "uniques/{fastq}.RDS"
        log:
            "logs/dada2/dereplicate-fastq/{fastq}.log"
        wrapper:
            "v3.0.4/bio/dada2/dereplicate-fastq"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies
---------------------

* ``bioconductor-dada2=1.28.0``

Input/Output
------------

**Input:**

* a FASTQ file

**Output:**

* RDS file containing a ``derep-class`` object

Params
------

* Optional arguments for ``derepFastq()``; provide them as Python ``key=value`` pairs.

Authors
-------

* Charlie Pauvert

Code
----

.. code-block:: R

    # __author__ = "Charlie Pauvert"
    # __copyright__ = "Copyright 2020, Charlie Pauvert"
    # __email__ = "cpauvert@protonmail.com"
    # __license__ = "MIT"

    # Snakemake wrapper for dereplicating FASTQ files using dada2 derepFastq function.

    # Sink the stderr and stdout to the snakemake log file
    # https://stackoverflow.com/a/48173272
    log.file<-file(snakemake@log[[1]],open="wt")
    sink(log.file)
    sink(log.file,type="message")

    library(dada2)

    # Prepare arguments (no matter the order)
    args<-list(
            fls = unlist(snakemake@input))
    # Check if extra params are passed
    if(length(snakemake@params) > 0 ){
        # Keeping only the named elements of the list for do.call()
        extra<-snakemake@params[ names(snakemake@params) != "" ]
        # Add them to the list of arguments
        args<-c(args, extra)
    } else{
        message("No optional parameters. Using default parameters from dada2::derepFastq()")
    }

    # Dereplicate
    uniques<-do.call(derepFastq, args)

    # Store as RDS file
    saveRDS(uniques,snakemake@output[[1]])

    # Proper syntax to close the connection for the log file
    # but could be optional for Snakemake wrapper
    sink(type="message")
    sink()
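
The Params section states that optional ``derepFastq()`` arguments are passed as ``key=value`` pairs. A minimal sketch of such a rule follows, assuming the ``verbose`` argument of ``derepFastq()`` is the option of interest. The rule name and the chosen value are illustrative and are not part of the wrapper.

.. code-block:: python

    # Hypothetical rule name; paths mirror the documented example
    rule dada2_dereplicate_fastq_verbose:
        input:
            # Quality filtered FASTQ file
            "filtered/{fastq}.fastq"
        output:
            # Dereplicated sequences stored in an RDS file
            "uniques/{fastq}.RDS"
        params:
            # Illustrative value: named entries are forwarded to derepFastq()
            verbose=True
        log:
            "logs/dada2/dereplicate-fastq/{fastq}.log"
        wrapper:
            "v3.0.4/bio/dada2/dereplicate-fastq"

Because the wrapper code keeps only the named elements of ``snakemake@params``, a value given without a key would be dropped rather than passed on to ``derepFastq()``.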