CNV_FACETS

cnv_facets detects somatic copy number variants (CNVs) from paired tumor and normal samples.

URL: https://github.com/dariober/cnv_facets

Example

This wrapper can be used in the following way:

rule test_cnv_facets_bam:
    input:
        tumor="T.sample.bam",
        normal="N.sample.bam",
        vcf="common.sample.vcf.gz",
    output:
        vcf="CNV_bam.vcf.gz",
        cnv="genome_bam.cnv.png",
        hist="cnv_bam.hist.pdf",
        spider="qc_bam.spider.pdf",
    log:
        "logs/cnv_facets_bam.log",
    params:
        extra="",
    wrapper:
        "v3.9.0/bio/cnv_facets"


rule test_cnv_facets_pileup:
    input:
        pileup="pileup.csv.gz",
        vcf="common.sample.vcf.gz",
    output:
        vcf="CNV_pileup.vcf.gz",
        cnv="genome_pileup.cnv.png",
        hist="cnv_pileup.hist.pdf",
        spider="qc_pileup.spider.pdf",
    log:
        "logs/cnv_facets_pileup.log",
    params:
        extra="",
    wrapper:
        "v3.9.0/bio/cnv_facets"

Note that input, output and log file paths can be chosen freely. The number of threads set on the rule is passed to cnv_facets as --snp-nprocs.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

cnv_facets=0.16.0
python=3.8.15

Input/Output

Input:

tumor: Path to tumor aligned reads. (BAM, required if pileup is not provided)
normal: Path to normal aligned reads. (BAM, required if pileup is not provided)
vcf: Path to common, polymorphic SNPs. (pbgzip VCF)
pileup: Path to pileup variants. (pbgzip CSV, replaces tumor and normal)
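
The choice between the BAM pair and the pileup can be sketched in plain Python as follows. This is a minimal illustration that mirrors the branching in the wrapper code shown in the Code section; the function name `build_input_args` and the plain dict used in place of `snakemake.input` are assumptions made for the example.

```python
def build_input_args(inputs):
    """Return the cnv_facets input arguments for a dict of named inputs.

    `inputs` is a hypothetical stand-in for `snakemake.input`.
    """
    if "tumor" in inputs and "normal" in inputs:
        # BAM mode: tumor and normal alignments plus the SNP VCF
        return (
            f"--snp-tumour {inputs['tumor']} "
            f"--snp-normal {inputs['normal']} "
            f"--snp-vcf {inputs['vcf']}"
        )
    if "pileup" in inputs:
        # Pileup mode: a precomputed pileup replaces the two BAM files
        return f"--pileup {inputs['pileup']} --snp-vcf {inputs['vcf']}"
    raise KeyError(
        "Either provide both `tumor` and `normal` BAM files, or a `pileup` file."
    )


bam_args = build_input_args(
    {"tumor": "T.sample.bam", "normal": "N.sample.bam", "vcf": "common.sample.vcf.gz"}
)
pileup_args = build_input_args(
    {"pileup": "pileup.csv.gz", "vcf": "common.sample.vcf.gz"}
)
print(bam_args)
print(pileup_args)
```

As in the wrapper, the BAM pair takes precedence when tumor, normal and pileup are all supplied.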

Output:

vcf: Path to copy number variants. (pbgzip VCF)
cnv: Path to a summary plot of CNVs across the genome. (PNG)
hist: Path to histograms of the read depth distribution across all positions in the tumor and normal samples, before and after filtering positions. (PDF)
spider: Path to a diagnostic plot to check how well the copy number fits work. (PDF)
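
Each named output corresponds to one file that cnv_facets writes next to its output prefix. The sketch below lists that mapping as it appears in the wrapper code in the Code section; `SUFFIXES` and `planned_moves` are hypothetical names, and the function only computes (source, destination) pairs without touching the filesystem.

```python
# Output key -> suffix that cnv_facets appends to the prefix given with -o
SUFFIXES = {
    "vcf": ".vcf.gz",
    "cnv": ".cnv.png",
    "hist": ".cov.pdf",
    "spider": ".spider.pdf",
}


def planned_moves(prefix, outputs):
    """List the (source, destination) pairs for the requested outputs."""
    moves = []
    for key, suffix in SUFFIXES.items():
        destination = outputs.get(key)
        if not destination:
            continue  # every output is optional
        moves.append((prefix + suffix, destination))
        if key == "vcf":
            # The index file is moved alongside the VCF
            moves.append((prefix + suffix + ".tbi", destination + ".tbi"))
    return moves


moves = planned_moves(
    "/tmp/facets_output",
    {"vcf": "CNV_bam.vcf.gz", "hist": "cnv_bam.hist.pdf"},
)
for source, destination in moves:
    print(source, "->", destination)
```

Two details are easy to miss: the hist output is taken from the `.cov.pdf` file, and a `.tbi` index is written next to the vcf output even though the example rules do not declare it.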

Params

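extra: Optional parameters passed to cnv_facets, besides the ones the wrapper sets itself: -t (--snp-tumour), -n (--snp-normal), -vcf (--snp-vcf), --pileup, --snp-nprocs and -o.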
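
Because the wrapper sets the input, output and thread options itself, repeating them in extra would pass them to cnv_facets twice. A minimal sketch of a pre-flight check is shown below. `RESERVED` and `reserved_flags_in` are hypothetical names, the short flags are those listed for extra, the long flags come from the wrapper code, and `--some-option` is a placeholder rather than a real cnv_facets flag.

```python
import shlex

# Options that the wrapper already sets on the command line
RESERVED = {
    "-t", "-n", "-vcf", "-o",
    "--snp-tumour", "--snp-normal", "--snp-vcf",
    "--pileup", "--snp-nprocs",
}


def reserved_flags_in(extra):
    """Return the wrapper-managed options found in an `extra` string."""
    found = []
    for token in shlex.split(extra):
        flag = token.split("=", 1)[0]  # also match the --flag=value form
        if flag in RESERVED and flag not in found:
            found.append(flag)
    return found


print(reserved_flags_in("--some-option 5"))
print(reserved_flags_in("-o results/sample --snp-nprocs 4"))
```

An empty result means the extra string does not collide with anything the wrapper manages.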

Authors

Thibault Dayris

Code

#!/usr/bin/env python3
# coding: utf-8

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2023, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell
from tempfile import TemporaryDirectory

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)

# Build the input arguments: either a tumor/normal BAM pair or a pileup file
input_data = ""
if all(key in snakemake.input.keys() for key in ["tumor", "normal"]):
    input_data = f" --snp-tumour {snakemake.input.tumor} --snp-normal {snakemake.input.normal}  --snp-vcf {snakemake.input.vcf} "
elif "pileup" in snakemake.input.keys():
    input_data = f" --pileup {snakemake.input.pileup} --snp-vcf {snakemake.input.vcf} "
else:
    raise KeyError(
        "Either provide both `tumor` *and* `normal` bam files, "
        "or a unique `pileup` file."
    )

with TemporaryDirectory() as tempdir:
    prefix = f"{tempdir}/facets_output"

    # Run cnv_facets
    shell(
        "cnv_facets.R "
        "{extra} "
        "--snp-nprocs {snakemake.threads} "
        "{input_data} "
        "-o {prefix} "
        "{log}"
    )

    # Move only the outputs the user requested to their final paths
    if snakemake.output.get("vcf"):
        shell("mv --verbose {prefix}.vcf.gz {snakemake.output.vcf} {log}")
        shell("mv --verbose {prefix}.vcf.gz.tbi {snakemake.output.vcf}.tbi {log}")

    if snakemake.output.get("cnv"):
        shell("mv --verbose {prefix}.cnv.png {snakemake.output.cnv} {log}")

    if snakemake.output.get("hist"):
        shell("mv --verbose {prefix}.cov.pdf {snakemake.output.hist} {log}")

    if snakemake.output.get("spider"):
        shell("mv --verbose {prefix}.spider.pdf {snakemake.output.spider} {log}")