FREEBAYES
Call small genomic variants with freebayes.
URL: https://github.com/freebayes/freebayes
Example
This wrapper can be used in the following way:
rule freebayes:
    input:
        alns="mapped/{sample}.bam",
        idxs="mapped/{sample}.bam.bai",
        ref="genome.fasta",
    output:
        vcf="calls/{sample}.vcf",
    log:
        "logs/freebayes/{sample}.log",
    params:
        normalize="-a",
    threads: 2
    resources:
        mem_mb=1024,
    wrapper:
        "v4.6.0/bio/freebayes"
rule freebayes_bcf:
    input:
        alns="mapped/{sample}.bam",
        ref="genome.fasta",
    output:
        bcf="calls/{sample}.bcf",
    log:
        "logs/freebayes/{sample}.bcf.log",
    threads: 2
    resources:
        mem_mb=1024,
    wrapper:
        "v4.6.0/bio/freebayes"
rule freebayes_bed:
    input:
        alns="mapped/{sample}.bam",
        ref="genome.fasta",
        regions="regions.bed",
    output:
        vcf="calls/{sample}.vcf.gz",
    log:
        "logs/freebayes/{sample}.bed.log",
    params:
        chunksize=50000,
    threads: 2
    resources:
        mem_mb=1024,
    wrapper:
        "v4.6.0/bio/freebayes"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
freebayes=1.3.8
bcftools=1.21
vcflib=1.0.10
htslib=1.21
parallel=20240722
bedtools=2.31.1
sed=4.8
snakemake-wrapper-utils=0.6.2
Input/Output
Input:
SAM/BAM/CRAM file(s)
reference genome
Output:
VCF/VCF.gz/BCF file
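The three example rules write `.vcf`, `.vcf.gz` and `.bcf` files, so the output format is presumably chosen from the file extension. A minimal sketch of such a mapping, assuming the standard bcftools `--output-type` codes (`v`, `z`, `b`). The function name `guess_output_type` is hypothetical and is not part of this wrapper or of snakemake-wrapper-utils:

```python
def guess_output_type(path: str) -> str:
    """Return the bcftools --output-type code implied by a file extension.

    Hypothetical helper for illustration only.
    """
    if path.endswith(".vcf.gz"):
        return "z"  # bgzip-compressed VCF
    if path.endswith(".bcf"):
        return "b"  # compressed BCF
    if path.endswith(".vcf"):
        return "v"  # uncompressed VCF
    raise ValueError(f"unsupported output extension: {path}")


# The three outputs used in the example rules above
for out in ("calls/a.vcf", "calls/a.vcf.gz", "calls/a.bcf"):
    print(out, "->", guess_output_type(out))
```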
Params
extra
: additional arguments for freebayes
normalize
: use bcftools norm to normalize indels (one of -a, -f, -m, -D or -d must be used)
chunksize
: reference genome chunk size for parallelization (default 100000; only used when threads > 1)
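To illustrate what `chunksize` controls, the sketch below splits each sequence listed in a FASTA index (`.fai`) into `chrom:start-end` regions of at most `chunksize` bases. It only illustrates the chunking idea, not the region-generation script shipped with freebayes. The function name `generate_regions` and the coordinate convention (start at 0, end equal to the next start) are assumptions:

```python
def generate_regions(fai_lines, chunksize=100000):
    """Yield 'chrom:start-end' strings covering every sequence in a .fai index.

    Hypothetical helper for illustration; coordinates are an assumption.
    """
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        chrom, length = fields[0], int(fields[1])
        start = 0
        while start < length:
            end = min(start + chunksize, length)
            yield f"{chrom}:{start}-{end}"
            start = end


# Two made-up index entries: name, length, offset, line bases, line width
fai = ["chr1\t250000\t6\t60\t61\n", "chr2\t80000\t254180\t60\t61\n"]
regions = list(generate_regions(fai, chunksize=100000))
print(regions)
```

A smaller `chunksize` yields more, shorter regions, which can balance work across threads better at the cost of more per-chunk overhead.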
Code
__author__ = "Johannes Köster, Felix Mölder, Christopher Schröder"
__copyright__ = "Copyright 2017, Johannes Köster"
__email__ = "johannes.koester@protonmail.com, felix.moelder@uni-due.de"
__license__ = "MIT"

from snakemake.shell import shell
from tempfile import TemporaryDirectory
from snakemake_wrapper_utils.bcftools import get_bcftools_opts

log = snakemake.log_fmt_shell(stdout=False, stderr=True)
extra = snakemake.params.get("extra", "")

# Options for `bcftools sort`: thread, reference, region, sample, target
# and output parsing are all disabled.
bcftools_sort_opts = get_bcftools_opts(
    snakemake,
    parse_threads=False,
    parse_ref=False,
    parse_regions=False,
    parse_samples=False,
    parse_targets=False,
    parse_output=False,
    parse_output_format=False,
)

# Final pipeline stage: `bcftools norm` if normalization was requested,
# otherwise `bcftools view`.
pipe = ""
norm_params = snakemake.params.get("normalize")
if norm_params:
    bcftools_norm_opts = get_bcftools_opts(
        snakemake, parse_regions=False, parse_targets=False, parse_memory=False
    )
    pipe = f"bcftools norm {bcftools_norm_opts} {norm_params}"
else:
    bcftools_view_opts = get_bcftools_opts(
        snakemake,
        parse_ref=False,
        parse_regions=False,
        parse_targets=False,
        parse_memory=False,
    )
    pipe = f"bcftools view {bcftools_view_opts}"

# With one thread, call freebayes directly; otherwise split the reference
# into chunks and run freebayes-parallel.
if snakemake.threads == 1:
    freebayes = "freebayes"
else:
    chunksize = snakemake.params.get("chunksize", 100000)
    regions = f"<(fasta_generate_regions.py {snakemake.input.ref}.fai {chunksize})"
    if snakemake.input.get("regions"):
        # Restrict the chunks to the BED file: convert chrom:start-end to
        # tab-separated columns, intersect, then convert back.
        regions = (
            "<(bedtools intersect -a "
            + r"<(sed 's/:\([0-9]*\)-\([0-9]*\)$/\t\1\t\2/' "
            + f"{regions}) -b {snakemake.input.regions} | "
            + r"sed 's/\t\([0-9]*\)\t\([0-9]*\)$/:\1-\2/')"
        )
    freebayes = f"freebayes-parallel {regions} {snakemake.threads}"

# Run freebayes, sort the calls, then pass them through the final stage.
with TemporaryDirectory() as tempdir:
    shell(
        "({freebayes}"
        " --fasta-reference {snakemake.input.ref}"
        " {extra}"
        " {snakemake.input.alns}"
        " | bcftools sort {bcftools_sort_opts} --temp-dir {tempdir}"
        " | {pipe}"
        ") {log}"
    )
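The two `sed` expressions in the wrapper convert region strings to tab-separated columns so that `bedtools intersect` can process them, and then convert the result back. The same round trip can be sketched in Python with `re.sub`. The function names `region_to_bed` and `bed_to_region` are hypothetical and exist only for this illustration:

```python
import re


def region_to_bed(region: str) -> str:
    """Turn 'chrom:start-end' into 'chrom<TAB>start<TAB>end'."""
    return re.sub(r":(\d*)-(\d*)$", r"\t\1\t\2", region)


def bed_to_region(bed_line: str) -> str:
    """Turn 'chrom<TAB>start<TAB>end' back into 'chrom:start-end'."""
    return re.sub(r"\t(\d*)\t(\d*)$", r":\1-\2", bed_line)


region = "chr1:0-100000"
bed_line = region_to_bed(region)
print(repr(bed_line))
print(bed_to_region(bed_line))
```

Anchoring both patterns at the end of the line means that a colon or hyphen inside the sequence name is left untouched.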