BCFTOOLS FILTER

Filter a VCF/BCF file with bcftools filter.

Example

This wrapper can be used in the following way:

rule bcf_filter_sample:
    input:
        "{prefix}.bcf",  # input bcf/vcf needs to be first input
        samples="samples.txt",  # other inputs, e.g. sample files, are optional
    output:
        "{prefix}.filter_sample.vcf",
    log:
        "log/{prefix}.filter_sample.vcf.log",
    params:
        filter=lambda w, input: f"--exclude 'GT[@{input.samples}]=\"0/1\"'",
        extra="",
    wrapper:
        "v1.2.1/bio/bcftools/filter"


rule bcf_filter_o_vcf:
    input:
        "{prefix}.bcf",
    output:
        "{prefix}.filter.vcf",
    log:
        "log/{prefix}.filter.vcf.log",
    params:
        filter="-i 'QUAL > 5'",
        extra="",
    wrapper:
        "v1.2.1/bio/bcftools/filter"


rule bcf_filter_o_vcf_gz:
    input:
        "{prefix}.bcf",
    output:
        "{prefix}.filter.vcf.gz",
    log:
        "log/{prefix}.filter.vcf.gz.log",
    params:
        filter="-i 'QUAL > 5'",
        extra="",
    wrapper:
        "v1.2.1/bio/bcftools/filter"


rule bcf_filter_o_bcf:
    input:
        "{prefix}.bcf",
    output:
        "{prefix}.filter.bcf",
    log:
        "log/{prefix}.filter.bcf.log",
    params:
        filter="-i 'QUAL > 5'",
        extra="",
    wrapper:
        "v1.2.1/bio/bcftools/filter"


rule bcf_filter_o_uncompressed_bcf:
    input:
        "{prefix}.bcf",
    output:
        "{prefix}.filter.uncompressed.bcf",
    log:
        "log/{prefix}.filter.uncompressed.bcf.log",
    params:
        uncompressed_bcf=True,
        filter="-i 'QUAL > 5'",
        extra="",
    wrapper:
        "v1.2.1/bio/bcftools/filter"

Note that input, output and log file paths can be chosen freely.
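In the first rule, the filter param is a function rather than a fixed string, so the sample file path is filled in when the rule is evaluated. The following is a minimal sketch of that expansion outside Snakemake. It assumes a SimpleNamespace as a hypothetical stand-in for Snakemake's input object, and None in place of the wildcards.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the rule's `input` object (not Snakemake's real class).
rule_input = SimpleNamespace(samples="samples.txt")

# Same expression as the `filter` param of rule bcf_filter_sample.
build_filter = lambda w, input: f"--exclude 'GT[@{input.samples}]=\"0/1\"'"

# Wildcards are unused by this lambda, so None is passed for illustration.
filter_arg = build_filter(None, rule_input)
print(filter_arg)  # --exclude 'GT[@samples.txt]="0/1"'
```

The single quotes keep the expression together as one shell argument, while the escaped double quotes survive into the bcftools expression itself.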

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • bcftools==1.12
  • snakemake-wrapper-utils==0.2

Input/Output

Input:

  • VCF/BCF file

Output:

  • Filtered VCF/BCF file

Notes

  • The uncompressed_bcf param specifies that a BCF output should be written uncompressed (it is ignored for other output formats).
  • The bcftools_use_mem param controls whether resources.mem_mb is passed to bcftools.
  • The extra param allows for additional program arguments (not --threads, -O/--output-type, -m/--max-mem, or -T/--temp-dir).
  • For more information, see https://samtools.github.io/bcftools/bcftools.html
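The notes above imply that the output type is chosen by the wrapper rather than through extra. A minimal sketch of how such a choice could be derived from the output file name is shown here. The function infer_output_type is hypothetical and is not the snakemake_wrapper_utils implementation. Only the bcftools -O values (v, z, b, u) are taken from the bcftools manual.

```python
def infer_output_type(output_path, uncompressed_bcf=False):
    """Map an output file name to a bcftools -O flag (illustrative only)."""
    if output_path.endswith(".vcf.gz"):
        return "-Oz"  # compressed VCF
    if output_path.endswith(".vcf"):
        return "-Ov"  # uncompressed VCF
    if output_path.endswith(".bcf"):
        # uncompressed_bcf only matters for BCF output
        return "-Ou" if uncompressed_bcf else "-Ob"
    raise ValueError(f"Unsupported output extension: {output_path}")


print(infer_output_type("a.filter.vcf.gz"))
print(infer_output_type("a.filter.uncompressed.bcf", uncompressed_bcf=True))
```

Under these assumptions, setting uncompressed_bcf=True for a .vcf output changes nothing, which matches the "ignored" behaviour described in the notes.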

Authors

  • Patrik Smeds
  • Nikos Tsardakas Renhuldt

Code

__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2021, Patrik Smeds"
__email__ = "patrik.smeds@scilifelab.uu.se"
__license__ = "MIT"


from snakemake.shell import shell
from snakemake_wrapper_utils.bcftools import get_bcftools_opts

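# Collect the options the wrapper manages itself (e.g. threads and output type);
# memory parsing is switched off here.
bcftools_opts = get_bcftools_opts(snakemake, parse_memory=False)
# Redirect only stderr to the rule's log file.
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# The wrapper writes a single file, so reject rules that declare more outputs.
if len(snakemake.output) > 1:
    raise Exception("Only one output file expected, got: " + str(len(snakemake.output)))

# Both params are optional and default to an empty string.
filter = snakemake.params.get("filter", "")
extra = snakemake.params.get("extra", "")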

shell(
    "bcftools filter {filter} {extra} {snakemake.input[0]} "
    "{bcftools_opts} "
    "-o {snakemake.output[0]} "
    "{log}"
)
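To check what command line the wrapper code above would assemble for a given rule, the string formatting can be reproduced outside Snakemake. This is a rough sketch only. The helper build_filter_command is hypothetical, the bcftools_opts value is an assumed example (the real string comes from get_bcftools_opts and may differ), and the "2>" redirect is assumed to mirror what log_fmt_shell produces for stderr-only logging.

```python
import shlex


def build_filter_command(input_path, output_path, filter_expr="", extra="",
                         bcftools_opts="", log_path=None):
    """Assemble a bcftools filter command in the same order as the wrapper."""
    parts = [
        "bcftools filter",
        filter_expr,
        extra,
        shlex.quote(input_path),
        bcftools_opts,
        "-o",
        shlex.quote(output_path),
    ]
    if log_path:
        # Assumed equivalent of log_fmt_shell(stdout=False, stderr=True)
        parts.append(f"2> {shlex.quote(log_path)}")
    # Skip empty pieces so unused params leave no double spaces.
    return " ".join(p for p in parts if p)


cmd = build_filter_command(
    "sample.bcf",
    "sample.filter.vcf",
    filter_expr="-i 'QUAL > 5'",
    bcftools_opts="-Ov",  # assumed example value
    log_path="log/sample.filter.vcf.log",
)
print(cmd)
```

Unlike the wrapper, this sketch quotes the file paths with shlex.quote, which is a design choice for safety with unusual file names and not something the original code does.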