.. _`bio/gatk/mutect`: GATK MUTECT2 ============ .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/gatk/mutect?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/gatk/mutect Call somatic SNVs and indels via local assembly of haplotypes **URL**: https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2 Example ------- This wrapper can be used in the following way: .. code-block:: python rule mutect2: input: fasta="genome/genome.fasta", map="mapped/{sample}.bam", output: vcf="variant/{sample}.vcf", message: "Testing Mutect2 with {wildcards.sample}" threads: 1 resources: mem_mb=1024, log: "logs/mutect_{sample}.log", wrapper: "v3.0.1/bio/gatk/mutect" rule mutect2_bam: input: fasta="genome/genome.fasta", map="mapped/{sample}.bam", output: vcf="variant_bam/{sample}.vcf", bam="variant_bam/{sample}.bam", message: "Testing Mutect2 with {wildcards.sample}" threads: 1 resources: mem_mb=1024, log: "logs/mutect_{sample}.log", wrapper: "v3.0.1/bio/gatk/mutect" rule mutect2_complete: input: fasta="genome/genome.fasta", map="mapped/{sample}.bam", intervals="genome/intervals.bed", output: vcf="variant_complete/{sample}.vcf", bam="variant_complete/{sample}.bam", f1r2="counts/{sample}.f1r2.tar.gz", message: "Testing Mutect2 with {wildcards.sample}" threads: 1 resources: mem_mb=1024, log: "logs/mutect_{sample}.log", wrapper: "v3.0.1/bio/gatk/mutect" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Software dependencies --------------------- * ``gatk4=4.4.0.0`` * ``snakemake-wrapper-utils=0.6.2`` Input/Output ------------ **Input:** * ``map``: Mapped reads (SAM/BAM/CRAM) * ``fasta``: Reference Fasta file * ``intervals``: Optional path to a BED interval file * ``pon``: Optional path to Panel of Normals (flagged as BETA) * ``germline``: Optional path to known germline variants **Output:** * ``vcf``: Path to variant file * ``bam``: Optional path to output bam file * ``f1r2``: Optional path to f1r2 count file Params ------ * ``extra``: Optional parameters for GATK Mutect2 * ``use_parallelgc``: Automatically add "-XX:ParallelGCThreads={snakemake.threads}" to your command line. Set to `True` if your architecture supports ParallelGCThreads. * ``use_omp``: Automatically set `OMP_NUM_THREADS` environment variable. Set to `True` if your java architecture uses OMP threads. * ``java_opts``: allows for additional arguments to be passed to the java compiler (not for `-XmX` or `-Djava.io.tmpdir`, `-XX:ParallelGCThreads`, since they are handled automatically). Authors ------- * Thibault Dayris * Filipe G. Vieira Code ---- .. code-block:: python """Snakemake wrapper for GATK4 Mutect2""" __author__ = "Thibault Dayris" __copyright__ = "Copyright 2019, Dayris Thibault" __email__ = "thibault.dayris@gustaveroussy.fr" __license__ = "MIT" import os import tempfile from snakemake.shell import shell from snakemake.utils import makedirs from snakemake_wrapper_utils.java import get_java_opts log = snakemake.log_fmt_shell(stdout=True, stderr=True) # On non-omp systems, and in case OMP_NUM_THREADS # was not set, define OMP_NUM_THREADS through python if "OMP_NUM_THREADS" not in os.environ.keys() and snakemake.params.get( "use_omp", False ): os.environ["OMP_NUM_THREADS"] = snakemake.threads bam_output = snakemake.output.get("bam", "") if bam_output: bam_output = f"--bam-output {bam_output }" germline_resource = snakemake.input.get("germline", "") if germline_resource: germline_resource = f"--germline-resource {germline_resource}" intervals = snakemake.input.get("intervals", "") if intervals: intervals = f"--intervals {intervals}" f1r2 = snakemake.output.get("f1r2", "") if f1r2: f1r2 = f"--f1r2-tar-gz {f1r2}" pon = snakemake.input.get("pon", "") if pon: pon = f"--panel-of-normals {pon}" extra = snakemake.params.get("extra", "") java_opts = get_java_opts(snakemake) # In case Java execution environment suits GC parallel # calls, these must be given as optional java parameters if snakemake.params.get("use_parallelgc", False): if "UseParallelGC" not in java_opts: java_opts += " -XX:+UseParallelGC " if "ParallelGCThreads" not in java_opts: java_opts += f" -XX:ParallelGCThreads={snakemake.threads}" with tempfile.TemporaryDirectory() as tmpdir: shell( "gatk --java-options '{java_opts}' Mutect2" # Tool and its subprocess " --native-pair-hmm-threads {snakemake.threads}" " --input {snakemake.input.map}" # Path to input mapping file " --reference {snakemake.input.fasta}" # Path to reference fasta file " {f1r2}" # Optional path to output f1r2 count file " {germline_resource}" # Optional path to optional germline resource VCF " {intervals}" # Optional path to optional bed intervals " {pon} " # Optional path to panel of normals " {extra}" # Extra parameters " --tmp-dir {tmpdir}" " --output {snakemake.output.vcf}" # Path to output vcf file " {bam_output}" # Path to output bam file, optional " {log}" # Logging behaviour ) .. |nl| raw:: html