SUBREAD FEATURECOUNTS
FeatureCounts assign mapped reads or fragments (paired-end data) to genomic features such as genes, exons and promoters.
URL: http://subread.sourceforge.net/
Example
This wrapper can be used in the following way:
rule feature_counts:
input:
# list of sam or bam files
samples="{sample}.bam",
annotation="annotation.gtf",
# optional input
#chr_names="", # implicitly sets the -A flag
#fasta="genome.fasta" # implicitly sets the -G flag
output:
multiext(
"results/{sample}",
".featureCounts",
".featureCounts.summary",
".featureCounts.jcounts",
),
threads: 2
params:
strand=0, # optional; strandness of the library (0: unstranded [default], 1: stranded, and 2: reversely stranded)
r_path="", # implicitly sets the --Rpath flag
extra="-O --fracOverlap 0.2 -J -p",
log:
"logs/{sample}.log",
wrapper:
"v4.6.0-24-g250dd3e/bio/subread/featurecounts"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The strand param allows to specify the strandness of the library (0: unstranded, 1: stranded, and 2: reversely stranded)
The extra param allows for additional program arguments.
Software dependencies
subread=2.0.6
Input/Output
Input:
a list of .sam or .bam files
GTF, GFF or SAF annotation file
optional a tab separating file that determines the sorting order and contains the chromosome names in the first column
optional a fasta index file
Output:
Feature counts file including read counts (tab separated)
Summary file including summary statistics (tab separated)
Junction counts file including count number of reads supporting each exon-exon junction (tab separated)
Code
__author__ = "Antonie Vietor"
__copyright__ = "Copyright 2020, Antonie Vietor"
__email__ = "antonie.v@gmx.de"
__license__ = "MIT"
import tempfile
from snakemake.shell import shell
log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")
# optional input files and directories
strand = snakemake.params.get("strand", 0)
fasta = snakemake.input.get("fasta", "")
if fasta:
fasta = f"-G {fasta}"
chr_names = snakemake.input.get("chr_names", "")
if chr_names:
chr_names = f"-A {chr_names}"
r_path = snakemake.params.get("r_path", "")
if r_path:
r_path = f"--Rpath {r_path}"
with tempfile.TemporaryDirectory() as tmpdir:
shell(
"featureCounts"
" -T {snakemake.threads}"
" -s {strand}"
" -a {snakemake.input.annotation}"
" {fasta}"
" {chr_names}"
" {r_path}"
" {extra}"
" --tmpDir {tmpdir}"
" -o {snakemake.output[0]}"
" {snakemake.input.samples}"
" {log}"
)