FGBIO FILTERCONSENSUSREADS

Filters consensus reads generated by CallMolecularConsensusReads or CallDuplexConsensusReads.

Example

This wrapper can be used in the following way:

rule FilterConsensusReads:
    input:
        "mapped/{sample}.bam"
    output:
        "mapped/{sample}.filtered.bam"
    params:
        extra="",
        min_base_quality=2,
        min_reads=[2, 2, 2],
        ref="genome.fasta"
    log:
        "logs/fgbio/filterconsensusreads/{sample}.log"
    threads: 1
    wrapper:
        "v3.8.0/bio/fgbio/filterconsensusreads"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

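min_base_quality: a single value (Int). Consensus bases with a quality below this threshold are masked (set to N). (default: 5) Note that the wrapper code below raises a ValueError unless an Int is provided, so set this parameter explicitly.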
min_reads: an array of Ints, min length 1, max length 3. Number of reads that need to support a UMI. For bam files processed with CallMolecularConsensusReads, one value is required. For bam files processed with CallDuplexConsensusReads, up to 3 values can be provided: the first value applies to the final consensus sequence and the last two apply to each strand's consensus. If fewer than 3 values are provided, the last value is repeated.
For more information, see http://fulcrumgenomics.github.io/fgbio/tools/latest/FilterConsensusReads.html
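The repetition rule for min_reads can be illustrated with a small sketch. The helper name expand_min_reads is hypothetical and is not part of the wrapper or of fgbio; it only mirrors the behaviour described in the note above (1 to 3 Ints, last value repeated) so that the effective thresholds are easy to read off.

```python
def expand_min_reads(min_reads):
    """Pad a min_reads list to three values by repeating its last element.

    Hypothetical helper for illustration only. The returned list is ordered as
    [final consensus, single-strand consensus, other single-strand consensus],
    following the description in the Notes section.
    """
    if not isinstance(min_reads, list) or not (1 <= len(min_reads) <= 3):
        raise ValueError("min_reads must be a list of Ints, min length 1, max length 3")
    if not all(isinstance(value, int) for value in min_reads):
        raise ValueError("every entry of min_reads must be an Int")
    # Repeat the last value until three thresholds are available.
    return min_reads + [min_reads[-1]] * (3 - len(min_reads))


if __name__ == "__main__":
    for example in ([2], [3, 1], [5, 3, 2]):
        print(example, "->", expand_min_reads(example))
```

For a bam file from CallMolecularConsensusReads, a single-element list such as [2] is all that is needed.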

Software dependencies

fgbio=2.2.1

Input/Output

Input:

bam file
reference genome (provided via the ref parameter)

Output:

filtered bam file

Authors

Patrik Smeds

Code

__author__ = "Patrik Smeds"
__copyright__ = "Copyright 2019, Patrik Smeds"
__email__ = "patrik.smeds@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell

shell.executable("bash")

log = snakemake.log_fmt_shell(stdout=False, stderr=True)

extra_params = snakemake.params.get("extra", "")

min_base_quality = snakemake.params.get("min_base_quality", None)
if not isinstance(min_base_quality, int):
    raise ValueError("min_base_quality needs to be provided as an Int!")

min_reads = snakemake.params.get("min_reads", None)
if not isinstance(min_reads, list) or not (1 <= len(min_reads) <= 3):
    raise ValueError(
        "min_reads needs to be provided as list of Ints, min length 1, max length 3!"
    )

ref = snakemake.params.get("ref", None)
if ref is None:
    raise ValueError("A reference needs to be provided!")

# The wrapper expects exactly one input bam file.
bam_input = snakemake.input[0]

if not isinstance(bam_input, str) or len(snakemake.input) != 1:
    raise ValueError("Input bam should be one bam file: " + str(bam_input) + "!")

# The wrapper writes exactly one output bam file.
bam_output = snakemake.output[0]

if not isinstance(bam_output, str) or len(snakemake.output) != 1:
    raise ValueError("Output should be one bam file: " + str(bam_output) + "!")

shell(
    "fgbio FilterConsensusReads"
    " -i {bam_input}"
    " -o {bam_output}"
    " -r {ref}"
    " --min-reads {min_reads}"
    " --min-base-quality {min_base_quality}"
    " {extra_params}"
    " {log}"
)
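To check parameter values before running the rule, the command line that the wrapper assembles can be approximated outside Snakemake. The sketch below is an assumption-laden preview, not the wrapper itself: the function name preview_command is hypothetical, and it assumes that Snakemake renders a list placeholder such as {min_reads} as space-separated values. Only the flags that appear in the wrapper code above are used.

```python
def preview_command(bam_input, bam_output, ref, min_reads, min_base_quality, extra=""):
    """Return an approximation of the fgbio command the wrapper would run.

    Hypothetical helper for illustration; it does not execute anything.
    """
    parts = [
        "fgbio FilterConsensusReads",
        f"-i {bam_input}",
        f"-o {bam_output}",
        f"-r {ref}",
        # Assumption: list values are joined with single spaces on the command line.
        "--min-reads " + " ".join(str(value) for value in min_reads),
        f"--min-base-quality {min_base_quality}",
    ]
    if extra:
        parts.append(extra)
    return " ".join(parts)


if __name__ == "__main__":
    print(
        preview_command(
            "mapped/a.bam",
            "mapped/a.filtered.bam",
            "genome.fasta",
            min_reads=[2, 2, 2],
            min_base_quality=2,
        )
    )
```

With the values from the example rule, this prints a command ending in `--min-reads 2 2 2 --min-base-quality 2`.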