BISMARK_METHYLATION_EXTRACTOR#
Call methylation counts from Bismark alignment results (see https://github.com/FelixKrueger/Bismark/blob/master/bismark_methylation_extractor).
Example#
This wrapper can be used in the following way:
rule bismark_methylation_extractor:
input: "bams/{sample}.bam"
output:
mbias_r1="qc/meth/{sample}.M-bias_R1.png",
# Only for PE BAMS:
# mbias_r2="qc/meth/{sample}.M-bias_R2.png",
mbias_report="meth/{sample}.M-bias.txt",
splitting_report="meth/{sample}_splitting_report.txt",
# 1-based start, 1-based end ('inclusive') methylation info: % and counts
methylome_CpG_cov="meth_cpg/{sample}.bismark.cov.gz",
# BedGraph with methylation percentage: 0-based start, end exclusive
methylome_CpG_mlevel_bedGraph="meth_cpg/{sample}.bedGraph.gz",
# Primary output files: methylation status at each read cytosine position: (extremely large)
read_base_meth_state_cpg="meth/CpG_context_{sample}.txt.gz",
# * You could merge CHG, CHH using: --merge_non_CpG
read_base_meth_state_chg="meth/CHG_context_{sample}.txt.gz",
read_base_meth_state_chh="meth/CHH_context_{sample}.txt.gz"
log:
"logs/meth/{sample}.log"
params:
output_dir="meth", # optional output dir
extra="--gzip --comprehensive --bedGraph" # optional params string
wrapper:
"v3.0.4/bio/bismark/bismark_methylation_extractor"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies#
bowtie2=2.5.2
bismark=0.24.2
samtools=1.18
perl-gdgraph=1.54
Input/Output#
Input:
Input BAM file aligned by Bismark
Output:
Depends on bismark options passed to params.extra, optional for this wrapper
mbias_report
: M-bias report, *.M-bias.txt (if key is provided, the out file will be renamed to this name)mbias_r1
: M-Bias plot for R1, *.M-bias_R1.png (if key is provided, the out file will be renamed to this name)mbias_r2
: M-Bias plot for R2, *.M-bias_R2.png (if key is provided, the out file will be renamed to this name)splitting_report
: Splitting report, *_splitting_report.txt (if key is provided, the out file will be renamed to this name)methylome_CpG_cov
: Bismark coverage file for CpG context, *.bismark.cov.gz (if key is provided, the out file will be renamed to this name)methylome_CpG_mlevel_bedGraph
: Bismark methylation level track, *.bedGraph.gzread_base_meth_state_cpg
: Per read CpG base methylation info, CpG_context_*.txt.gz (if key is provided, the out file will be renamed to this name)read_base_meth_state_chg
: Per read CpG base methylation info, CHG_context_*.txt.gz (if key is provided, the out file will be renamed to this name)read_base_meth_state_chh
: Per read CpG base methylation info, CHH_context_*.txt.gz (if key is provided, the out file will be renamed to this name)
Params#
output_dir
: Output directory (current dir is used if not specified)ignore
: Number of bases to trim at 5’ end in R1 (see bismark_methylation_extractor documentation), optional argumentignore_3prime
: Number of bases to trim at 3’ end in R1 (see bismark_methylation_extractor documentation), optional argumentignore_r2
: Number of bases to trim at 5’ end in R2 (see bismark_methylation_extractor documentation), optional argumentignore_3prime_r2
: Number of bases to trim at 3’ end in R2 (see bismark_methylation_extractor documentation), optional argumentextra
: Any additional args
Code#
"""Snakemake wrapper for Bismark methylation extractor tool: bismark_methylation_extractor."""
# https://github.com/FelixKrueger/Bismark/blob/master/bismark_methylation_extractor
__author__ = "Roman Chernyatchik"
__copyright__ = "Copyright (c) 2019 JetBrains"
__email__ = "roman.chernyatchik@jetbrains.com"
__license__ = "MIT"
import os
from snakemake.shell import shell
params_extra = snakemake.params.get("extra", "")
cmdline_args = ["bismark_methylation_extractor {params_extra}"]
# output dir
output_dir = snakemake.params.get("output_dir", "")
if output_dir:
cmdline_args.append("-o {output_dir:q}")
# trimming options
trimming_options = [
"ignore", # meth_bias_r1_5end
"ignore_3prime", # meth_bias_r1_3end
"ignore_r2", # meth_bias_r2_5end
"ignore_3prime_r2", # meth_bias_r2_3end
]
for key in trimming_options:
value = snakemake.params.get(key, None)
if value:
cmdline_args.append("--{} {}".format(key, value))
# Input
cmdline_args.append("{snakemake.input}")
# log
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
cmdline_args.append("{log}")
# run
shell(" ".join(cmdline_args))
key2prefix_suffix = [
("mbias_report", ("", ".M-bias.txt")),
("mbias_r1", ("", ".M-bias_R1.png")),
("mbias_r2", ("", ".M-bias_R2.png")),
("splitting_report", ("", "_splitting_report.txt")),
("methylome_CpG_cov", ("", ".bismark.cov.gz")),
("methylome_CpG_mlevel_bedGraph", ("", ".bedGraph.gz")),
("read_base_meth_state_cpg", ("CpG_context_", ".txt.gz")),
("read_base_meth_state_chg", ("CHG_context_", ".txt.gz")),
("read_base_meth_state_chh", ("CHH_context_", ".txt.gz")),
]
log_append = snakemake.log_fmt_shell(stdout=True, stderr=True, append=True)
for key, (prefix, suffix) in key2prefix_suffix:
exp_path = snakemake.output.get(key, None)
if exp_path:
if len(snakemake.input) != 1:
raise ValueError(
"bismark/bismark_methylation_extractor: Error: only one BAM file is"
" expected in input, but was <{}>".format(snakemake.input)
)
bam_file = snakemake.input[0]
bam_name = os.path.basename(bam_file)
bam_wo_ext = os.path.splitext(bam_name)[0]
actual_path = os.path.join(output_dir, prefix + bam_wo_ext + suffix)
if exp_path != actual_path:
shell("mv {actual_path:q} {exp_path:q} {log_append}")