BEDTOOLS MERGE

Merge entries in one or multiple BED/BAM/VCF/GFF files with bedtools.

Software dependencies

  • bedtools =2.29.0

Example

This wrapper can be used in the following way:

rule bedtools_merge:
    input:
        # Multiple BED files can be given as a list; they are concatenated and sorted before merging
        "A.bed"
    output:
        "A.merged.bed"
    params:
        ## Add optional parameters
        extra="-c 1 -o count" ## In this example, we want to count how many input lines we merged per output line
    log:
        "logs/merge/A.log"
    wrapper:
        "0.67.0/bio/bedtools/merge"

Note that input, output and log file paths can be chosen freely. When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.
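
To make the effect of extra="-c 1 -o count" in the rule above more concrete, the following is a minimal pure-Python sketch of the kind of result it produces. It is an illustration only, not how bedtools is implemented. The function name merge_with_count and the sample intervals are hypothetical. The sketch assumes the input is already sorted by chromosome and start, and that overlapping and book-ended intervals are merged, which matches the bedtools merge default of -d 0.

```python
def merge_with_count(intervals):
    """Merge overlapping or book-ended intervals and count inputs per merged interval.

    `intervals` is an iterable of (chrom, start, end) tuples that is already
    sorted by chrom and start (hypothetical helper, for illustration only).
    """
    merged = []
    for chrom, start, end in intervals:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Same chromosome and touching/overlapping: extend the last interval
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end), prev[3] + 1)
        else:
            # New chromosome or a gap: start a new merged interval with count 1
            merged.append((chrom, start, end, 1))
    return merged


# Made-up sample data, not taken from any real BED file
example = [
    ("chr1", 100, 200),
    ("chr1", 180, 250),
    ("chr1", 250, 300),
    ("chr1", 501, 1000),
    ("chr2", 10, 20),
]

for row in merge_with_count(example):
    print(*row, sep="\t")
```

The fourth column of each printed row corresponds to the count that -c 1 -o count appends: the number of input lines that were collapsed into that output line.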

Authors

  • Jan Forster

Code

__author__ = "Jan Forster, Felix Mölder"
__copyright__ = "Copyright 2019, Jan Forster"
__email__ = "j.forster@dkfz.de, felix.moelder@uni-due.de"
__license__ = "MIT"

from snakemake.shell import shell

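## Extract arguments
extra = snakemake.params.get("extra", "")

## Send stdout and stderr of the shell command to the log file
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

## Several inputs: concatenate and sort them first, because bedtools merge
## expects a single input sorted by chromosome and start position
if len(snakemake.input) > 1: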
    if all(f.endswith(".gz") for f in snakemake.input):
        cat = "zcat"
    elif all(not f.endswith(".gz") for f in snakemake.input):
        cat = "cat"
    else:
        raise ValueError("Input files must be all compressed or uncompressed.")
    shell(
        "({cat} {snakemake.input} | "
        "sort -k1,1 -k2,2n | "
        "bedtools merge {extra} "
        "-i stdin > {snakemake.output}) "
        " {log}"
    )
else:
    ## Single input: pass it to bedtools merge directly; it must already be sorted
    shell(
        "( bedtools merge"
        " {extra}"
        " -i {snakemake.input}"
        " > {snakemake.output})"
        " {log}"
    )
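
The multi-input branch above makes two decisions that can be tried outside Snakemake: which concatenation command to use, and how the concatenated lines are ordered before merging. The sketch below mirrors both in plain Python. The names choose_cat and sort_key are hypothetical and not part of the wrapper, and the sample lines are made up. The sort key only approximates sort -k1,1 -k2,2n: it assumes tab-separated lines and compares chromosome names by code point, whereas the shell command follows the active locale.

```python
def choose_cat(paths):
    """Return "zcat" if every path ends in .gz, "cat" if none does.

    Hypothetical helper that mirrors the wrapper's check; a mix of
    compressed and uncompressed inputs raises ValueError.
    """
    gzipped = [p.endswith(".gz") for p in paths]
    if all(gzipped):
        return "zcat"
    if not any(gzipped):
        return "cat"
    raise ValueError("Input files must be all compressed or uncompressed.")


def sort_key(line):
    """Approximate `sort -k1,1 -k2,2n`: chromosome as text, start as a number."""
    fields = line.split("\t")
    return (fields[0], int(fields[1]))


# Made-up BED lines standing in for the concatenated inputs
lines = ["chr2\t5\t9", "chr1\t100\t200", "chr1\t20\t30"]
sorted_lines = sorted(lines, key=sort_key)

print(choose_cat(["A.bed", "B.bed"]))
print(choose_cat(["A.bed.gz", "B.bed.gz"]))
for line in sorted_lines:
    print(line)
```

Sorting the start column numerically matters here: a plain text sort would place the start 100 before 20, and bedtools merge would then receive unsorted input.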