CNV_FACETS
cnv_facets detects somatic copy number variants (CNVs).
URL: https://github.com/dariober/cnv_facets
Example
This wrapper can be used in the following way:
rule test_cnv_facets_bam:
    input:
        tumor="T.sample.bam",
        normal="N.sample.bam",
        vcf="common.sample.vcf.gz",
    output:
        vcf="CNV_bam.vcf.gz",
        cnv="genome_bam.cnv.png",
        hist="cnv_bam.hist.pdf",
        spider="qc_bam.spider.pdf",
    log:
        "logs/cnv_facets_bam.log",
    params:
        extra="",
    wrapper:
        "v5.0.0/bio/cnv_facets"

rule test_cnv_facets_pileup:
    input:
        pileup="pileup.csv.gz",
        vcf="common.sample.vcf.gz",
    output:
        vcf="CNV_pileup.vcf.gz",
        cnv="genome_pileup.cnv.png",
        hist="cnv_pileup.hist.pdf",
        spider="qc_pileup.spider.pdf",
    log:
        "logs/cnv_facets_pileup.log",
    params:
        extra="",
    wrapper:
        "v5.0.0/bio/cnv_facets"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
cnv_facets=0.16.1
Input/Output
Input:
tumor
: Path to tumor aligned reads. (BAM, required if pileup is empty)
normal
: Path to normal aligned reads. (BAM, required if pileup is empty)
vcf
: Path to common, polymorphic SNPs. (pbgzip VCF)
pileup
: Path to pileup variants. (pbgzip CSV, replaces tumor and normal)
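The choice between BAM and pileup input can be sketched in plain Python. This is a minimal illustration rather than the wrapper's own code: `select_input_mode` is a hypothetical helper, and a plain dict stands in for `snakemake.input`.

```python
def select_input_mode(inputs):
    """Return "bam" or "pileup" depending on which named inputs are present.

    `inputs` is a plain dict standing in for `snakemake.input` (assumption).
    """
    # Both BAM files are needed together; one alone is not enough.
    if "tumor" in inputs and "normal" in inputs:
        return "bam"
    # Otherwise a single pileup file replaces the two BAM files.
    if "pileup" in inputs:
        return "pileup"
    raise KeyError(
        "Either provide both `tumor` and `normal` BAM files, "
        "or a single `pileup` file."
    )


bam_inputs = {
    "tumor": "T.sample.bam",
    "normal": "N.sample.bam",
    "vcf": "common.sample.vcf.gz",
}
pileup_inputs = {"pileup": "pileup.csv.gz", "vcf": "common.sample.vcf.gz"}

print(select_input_mode(bam_inputs))
print(select_input_mode(pileup_inputs))
```

In both modes the `vcf` of common SNPs is still supplied alongside the reads or the pileup.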
Output:
vcf
: Path to copy number variants. (pbgzip VCF)
cnv
: Path to a summary plot of CNVs across the genome. (PNG)
hist
: Path to histograms of the read depth distribution across all positions in the tumour and normal samples, before and after filtering positions. (PDF)
spider
: Path to a diagnostic plot to check how well the copy number fits work. (PDF)
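As a rough sketch of how these named outputs relate to the files cnv_facets writes, the mapping below pairs each output key with the file suffix that the wrapper code on this page moves from its temporary prefix. `expected_facets_files` is a hypothetical helper introduced only for illustration.

```python
# Suffixes taken from the `mv` commands in the wrapper code on this page.
OUTPUT_SUFFIXES = {
    "vcf": ".vcf.gz",
    "cnv": ".cnv.png",
    "hist": ".cov.pdf",
    "spider": ".spider.pdf",
}


def expected_facets_files(prefix, requested):
    """Map each requested output key to the file expected under `prefix`."""
    return {key: prefix + OUTPUT_SUFFIXES[key] for key in requested}


# Only the outputs declared in the rule are looked up.
files = expected_facets_files("/tmp/facets_output", ["vcf", "hist"])
for key, path in files.items():
    print(key, path)
```

Note that the `hist` output corresponds to a file ending in `.cov.pdf`, so the output key and the file suffix do not share a name.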
Params
extra
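: Optional parameters passed to cnv_facets, other than -t, -n, -vcf and -o, which the wrapper sets itself. The wrapper also sets --pileup (in pileup mode) and --snp-nprocs (from the rule's threads).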
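One way to guard against passing a flag twice is to scan `extra` before running the rule. The sketch below is an assumption-laden illustration, not part of the wrapper: `reserved_flags_in` is a hypothetical helper, the reserved set combines the short flags named above with the long flags that appear in the wrapper code, and `--some-option` is a placeholder rather than a real cnv_facets option.

```python
import shlex

# Flags the wrapper sets itself: short forms from the Params description,
# long forms from the wrapper code on this page.
RESERVED_FLAGS = {
    "-t", "-n", "-vcf", "-o",
    "--snp-tumour", "--snp-normal", "--snp-vcf", "--pileup", "--snp-nprocs",
}


def reserved_flags_in(extra):
    """Return the reserved flags found in the `extra` string, in order."""
    found = []
    for token in shlex.split(extra):
        # Also catch the `--flag=value` spelling.
        flag = token.split("=", 1)[0]
        if flag in RESERVED_FLAGS:
            found.append(flag)
    return found


print(reserved_flags_in("--some-option 1"))
print(reserved_flags_in("-o results --snp-nprocs=4"))
```

An empty list means `extra` does not collide with anything the wrapper already sets.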
Code
#!/usr/bin/env python3
# coding: utf-8

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2023, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell
from tempfile import TemporaryDirectory

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)

# Consider user input datasets
input_data = ""
if all(key in snakemake.input.keys() for key in ["tumor", "normal"]):
    input_data = f" --snp-tumour {snakemake.input.tumor} --snp-normal {snakemake.input.normal} --snp-vcf {snakemake.input.vcf} "
elif "pileup" in snakemake.input.keys():
    input_data = f" --pileup {snakemake.input.pileup} --snp-vcf {snakemake.input.vcf} "
else:
    raise KeyError(
        "Either provide both `tumor` *and* `normal` bam files, "
        "or a unique `pileup` file."
    )

with TemporaryDirectory() as tempdir:
    prefix = f"{tempdir}/facets_output"

    # Run cnv_facets
    shell(
        "cnv_facets.R "
        "{extra} "
        "--snp-nprocs {snakemake.threads} "
        "{input_data} "
        "-o {prefix} "
        "{log}"
    )

    # Allow user to define all output files
    if snakemake.output.get("vcf"):
        shell("mv --verbose {prefix}.vcf.gz {snakemake.output.vcf} {log}")
        shell("mv --verbose {prefix}.vcf.gz.tbi {snakemake.output.vcf}.tbi {log}")

    if snakemake.output.get("cnv"):
        shell("mv --verbose {prefix}.cnv.png {snakemake.output.cnv} {log}")

    if snakemake.output.get("hist"):
        shell("mv --verbose {prefix}.cov.pdf {snakemake.output.hist} {log}")

    if snakemake.output.get("spider"):
        shell("mv --verbose {prefix}.spider.pdf {snakemake.output.spider} {log}")