.. _`bio/gridss/preprocess`:

GRIDSS PREPROCESS
=================

.. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/gridss/preprocess?label=version%20update%20pull%20requests
   :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/gridss/preprocess

GRIDSS is a modular software suite containing tools for the detection of genomic rearrangements. It includes a genome-wide break-end assembler, as well as a structural variation caller for Illumina sequencing data. The ``preprocess`` step pre-processes input BAM files and can be run separately for each file.

**URL**: https://github.com/PapenfussLab/gridss

Example
-------

This wrapper can be used in the following way:

.. code-block:: python

    WORKING_DIR="working_dir"

    rule gridss_preprocess:
        input:
            bam="mapped/{sample}.bam",
            bai="mapped/{sample}.bam.bai",
            reference="reference/genome.fasta",
            dictionary="reference/genome.dict",
            refindex=multiext("reference/genome.fasta", ".amb", ".ann", ".bwt", ".pac", ".sa")
        output:
            multiext("{WORKING_DIR}/{sample}.bam.gridss.working/{sample}.bam", ".cigar_metrics", ".computesamtags.changes.tsv", ".coverage.blacklist.bed", ".idsv_metrics", ".insert_size_histogram.pdf", ".insert_size_metrics", ".mapq_metrics", ".sv.bam", ".sv.bam.csi", ".tag_metrics")
        params:
            extra="--jvmheap 1g",
            workingdir=WORKING_DIR
        log:
            "log/gridss/preprocess/{WORKING_DIR}/{sample}.preprocess.log"
        threads:
            8
        wrapper:
            "v3.0.1/bio/gridss/preprocess"

Note that input, output and log file paths can be chosen freely.

When running with

.. code-block:: bash

    snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies
---------------------

* ``gridss=2.13.2``

Authors
-------

* Christopher Schröder

Code
----

.. code-block:: python

    """Snakemake wrapper for gridss preprocess"""

    __author__ = "Christopher Schröder"
    __copyright__ = "Copyright 2020, Christopher Schröder"
    __email__ = "christopher.schroede@tu-dortmund.de"
    __license__ = "MIT"

    from snakemake.shell import shell
    from os import path

    # Redirect stdout and stderr of the shell command to the rule's log file
    log = snakemake.log_fmt_shell(stdout=True, stderr=True)

    # Optional extra command-line arguments passed through to gridss
    extra = snakemake.params.get("extra", "")

    # Check inputs/arguments.
    reference = snakemake.input.get("reference")

    if not snakemake.params.get("workingdir"):
        raise ValueError("Please set params.workingdir to provide a working directory.")

    if not reference:
        raise ValueError("Please set input.reference to provide reference genome.")

    # The bwa index files must sit next to the reference
    for ending in (".amb", ".ann", ".bwt", ".pac", ".sa"):
        if not path.exists("{}{}".format(reference, ending)):
            raise ValueError(
                "{reference}{ending} missing. Please make sure the reference was properly indexed by bwa.".format(
                    reference=reference, ending=ending
                )
            )

    # The sequence dictionary is expected next to the reference, with a .dict extension
    dictionary = path.splitext(reference)[0] + ".dict"
    if not path.exists(dictionary):
        raise ValueError(
            "{dictionary} missing. Please make sure the reference dictionary was properly created. This can be accomplished for example by CreateSequenceDictionary.jar from Picard.".format(
                dictionary=dictionary
            )
        )

    shell(
        "(gridss -s preprocess "  # Tool
        "--reference {reference} "  # Reference
        "--threads {snakemake.threads} "
        "--workingdir {snakemake.params.workingdir} "
        "{snakemake.input.bam} "
        "{extra}) {log}"
    )

.. |nl| raw:: html