BUSTOOLS COUNT#
Convert BUS files into a barcode-feature matrix with bustools count.
URL: https://github.com/BUStools/bustools#count
Example#
This wrapper can be used in the following way:
rule test_bustools_count:
    input:
        bus="file.bus",
        ecmap="matrix.ec",
        txnames="transcripts.txt",
        genemap="t2g.txt",
    output:
        multiext(
            "buscount",
            ".barcodes.txt",
            ".CUPerCell.txt",
            ".cu.txt",
            ".genes.txt",
            ".hist.txt",
            ".mtx",
        ),
    threads: 1
    params:
        extra="",
    log:
        "bustools.log",
    wrapper:
        "v3.0.1/bio/bustools/count"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes#
When multiple bus files are provided, only one count matrix is returned.
When an output ends with ".hist.txt", the --hist parameter is added automatically.
When an output ends with ".genes.txt", the --genecounts parameter is added automatically.
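The flag inference described in these notes can be tried outside Snakemake. The following is a minimal sketch that mirrors the checks in the wrapper code; the function name add_implied_flags is hypothetical and is not part of the wrapper.

```python
def add_implied_flags(outputs, extra=""):
    """Return `extra` with the flags implied by the declared output names.

    Hypothetical helper: it mirrors the wrapper's checks but is not part of it.
    """
    # A ".hist.txt" output implies --hist, unless the user already passed it.
    if any(name.endswith(".hist.txt") for name in outputs):
        if "--hist" not in extra:
            extra += " --hist"
    # A ".genes.txt" output implies --genecounts, unless already passed.
    if any(name.endswith(".genes.txt") for name in outputs):
        if "--genecounts" not in extra:
            extra += " --genecounts"
    return extra


outputs = [
    "buscount.barcodes.txt",
    "buscount.genes.txt",
    "buscount.hist.txt",
    "buscount.mtx",
]
print(add_implied_flags(outputs))
```

A flag that is already present in extra is not added a second time, so passing it explicitly in params is harmless.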
Software dependencies#
bustools=0.43.1
Input/Output#
Input:
bus: Single BUS file, or list of BUS files
genemap: Transcript to gene mapping
txnames: List of transcripts
ecmap: Equivalence classes for transcripts
Output:
barcodes, equivalence classes, and count matrix
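The wrapper code derives the value passed to --output from the declared output paths, so the paths need to share a prefix that ends in a dot. The sketch below reproduces that derivation in isolation; the helper name output_prefix and the "results/sample1" paths are illustrative assumptions.

```python
from os.path import commonprefix


def output_prefix(outputs):
    """Hypothetical helper reproducing the wrapper's prefix derivation."""
    # commonprefix compares strings character by character, so for outputs
    # such as "buscount.mtx" and "buscount.genes.txt" it returns "buscount.".
    # The trailing character (the dot) is then dropped.
    return commonprefix(list(outputs))[:-1]


example = [
    "buscount.barcodes.txt",
    "buscount.CUPerCell.txt",
    "buscount.cu.txt",
    "buscount.genes.txt",
    "buscount.hist.txt",
    "buscount.mtx",
]
print(output_prefix(example))
print(output_prefix(["results/sample1.barcodes.txt", "results/sample1.mtx"]))
# Caveat: with a single declared output the whole path is the common prefix,
# so only its last character is removed.
print(output_prefix(["buscount.mtx"]))
```

Because the last character is dropped unconditionally, declaring at least two outputs with different suffixes appears to be the safe way to obtain the intended prefix.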
Params#
extra: Optional parameters, besides --output, --ecmap, --txnames, and --genemap
Code#
#!/usr/bin/env python3
# coding: utf-8

"""Snakemake wrapper for bustools count"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2022, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell
from os.path import commonprefix

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Get IO files and prefixes
bus_files = snakemake.input["bus"]
if isinstance(bus_files, list):
    bus_files = " ".join(bus_files)

out_prefix = commonprefix(snakemake.output)[:-1]

# Fill extra parameters if needed
extra = snakemake.params.get("extra", "")
if any(outfile.endswith(".hist.txt") for outfile in snakemake.output):
    if "--hist" not in extra:
        extra += " --hist"

if any(outfile.endswith(".genes.txt") for outfile in snakemake.output):
    if "--genecounts" not in extra:
        extra += " --genecounts"

shell(
    "bustools count {extra} "
    "--output {out_prefix} "
    "--genemap {snakemake.input.genemap} "
    "--ecmap {snakemake.input.ecmap} "
    "--txnames {snakemake.input.txnames} "
    "{bus_files} "
    "{log}"
)
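To preview the command line that this wrapper would assemble without running Snakemake, the string formatting can be imitated in plain Python. This is only a sketch: build_command is a hypothetical helper, the log redirection is left out, and paths are not shell-quoted, as in the wrapper.

```python
def build_command(bus, genemap, ecmap, txnames, out_prefix, extra=""):
    """Hypothetical preview of the assembled command; nothing is executed."""
    # A list of BUS files is joined with spaces; a single path is used as-is.
    bus_files = " ".join(bus) if isinstance(bus, list) else bus
    return (
        f"bustools count {extra} "
        f"--output {out_prefix} "
        f"--genemap {genemap} "
        f"--ecmap {ecmap} "
        f"--txnames {txnames} "
        f"{bus_files}"
    )


cmd = build_command(
    bus=["a.bus", "b.bus"],
    genemap="t2g.txt",
    ecmap="matrix.ec",
    txnames="transcripts.txt",
    out_prefix="buscount",
    extra="--genecounts",
)
print(cmd)
```

All BUS files end up in one invocation with a single --output prefix, which matches the note that several BUS files yield one count matrix.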