.. _`bio/cnvkit/batch`: CNVKIT BATCH ============ .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/cnvkit/batch?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/cnvkit/batch Run the CNVkit pipeline on one or more BAM files to call cnv alterations. Or use it to generate a new reference file **URL**: https://cnvkit.readthedocs.io/en/stable/pipeline.html Example ------- This wrapper can be used in the following way: .. code-block:: python rule cnvkit_batch_build_reference: input: bam=["mapped/a.bam"], bai=["mapped/a.bam.bai"], fasta="ref/ref.fa", antitarget="ref/a.antitarget.bed", target="ref/a.target.bed", output: reference="cnvkit/reference.cnn", log: "cnvkit/a.reference.cnn.log", params: method='hybrid', # optional extra="--target-avg-size 10", # optional resources: mem_mb=1024, wrapper: "v3.0.1/bio/cnvkit/batch" rule cnvkit_batch: input: bam=["mapped/a.bam"], bai=["mapped/a.bam.bai"], reference="ref/reference.cnn", output: antitarget_coverage="cnvkit/a.antitargetcoverage.cnn", bins="cnvkit/a.bintest.cns", regions="cnvkit/a.cnr", segments="cnvkit/a.cns", segments_called="cnvkit/a.call.cns", target_coverage="cnvkit/a.targetcoverage.cnn", log: "cnvkit/a.antitargetcoverage.cnn.log", params: method='hybrid', # optional extra="", # optional resources: mem_mb=1024, wrapper: "v3.0.1/bio/cnvkit/batch" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Software dependencies --------------------- * ``cnvkit=0.9.10`` Input/Output ------------ **Input:** * ``bam``: one or more bam files * ``reference``: copy reference file, when calling samples * ``fasta``: reference gneome, when building new reference * ``antitarget``: bed antitarget file, when building new reference * ``target``: bed target file, when building new reference * ``mappability``: mappability file, when building new reference **Output:** * a set of cnvkit generated files, multiple types of cns, cnr and cnn files * or a new reference file Params ------ * ``method``: Optional, if no value is set cnvkits default value will be use * ``extra``: Any additional arguments to pass Authors ------- * Patrik Smeds Code ---- .. code-block:: python __author__ = "Patrik Smeds" __copyright__ = "Copyright 2023, Patrik Smeds" __email__ = "patrik.smeds@scilifelab.uu.se" __license__ = "MIT" import logging from os import listdir from os.path import basename from os.path import dirname from os.path import join from tempfile import TemporaryDirectory import shutil from snakemake.shell import shell log = snakemake.log_fmt_shell(stdout=False, stderr=True) input_bam_files = f"{snakemake.input.bam}" target = "" antitarget = "" ref_fasta = "" reference_cnv = "" mappability = "" create_reference = snakemake.output.get("reference", False) if create_reference: input_bam_files = f"-n {input_bam_files}" ref_fasta = f"-f {snakemake.input.fasta}" target = f"-t {snakemake.input.target}" antitarget = f"-a {snakemake.input.antitarget}" if "mappability" in snakemake.input: mappability = f"-g {snakemake.input.mappability}" if len(snakemake.output) > 1: exception_message = ( "Only one output expected when creating a reference, multiple output " f"files defined as output: [{snakemake.output}]" ) raise Exception(exception_message) else: reference_cnv = f"-r {snakemake.input.reference}" method = snakemake.params.get("method", "hybrid") if method: method = f"-m {method}" extra = snakemake.params.get("extra", "") with TemporaryDirectory() as tmpdirname: if create_reference: output = f"--output-reference {join(tmpdirname, 'reference.cnn')}" output_list = [snakemake.output.reference] else: output_list = [value for key, value in snakemake.output.items()] output = f"-d {tmpdirname} " shell( "(cnvkit.py batch {input_bam_files} " "{reference_cnv} " "{method} " "{ref_fasta} " "{target} " "{antitarget} " "{output} " "{extra}) {log}" ) for output_file in output_list: filename = basename(output_file) parent = dirname(output_file) try: shutil.copy2(join(tmpdirname, filename), output_file) except FileNotFoundError as e: temp_files = listdir(tmpdirname) logging.error( f"Couldn't locate file {basename} possible files are {[basename(f) for f in temp_files]}" ) raise e .. |nl| raw:: html