BUSTOOLS COUNT


Convert BUS files into a barcode-feature matrix with bustools count.

URL: https://github.com/BUStools/bustools#count

Example

This wrapper can be used in the following way:

rule test_bustools_count:
    input:
        bus="file.bus",
        ecmap="matrix.ec",
        txnames="transcripts.txt",
        genemap="t2g.txt",
    output:
        multiext(
            "buscount",
            ".barcodes.txt",
            ".CUPerCell.txt",
            ".cu.txt",
            ".genes.txt",
            ".hist.txt",
            ".mtx",
        ),
    threads: 1
    params:
        extra="",
    log:
        "bustools.log",
    wrapper:
        "v1.20.0-15-g28df43c2/bio/bustools/count"

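Note that input, output and log file paths can be chosen freely, provided that all output paths share a common prefix, which the wrapper passes to bustools as --output.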

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

When multiple bus files are provided, only one count matrix is returned.

When an output path ends with ".hist.txt", the --hist parameter is added automatically.

When an output path ends with ".genes.txt", the --genecounts parameter is added automatically.
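The suffix-based behaviour can be sketched in plain Python, mirroring the logic of the wrapper script in the Code section. The function name fill_extra is hypothetical and used only for illustration; the wrapper itself works directly on snakemake.output and snakemake.params.

```python
def fill_extra(extra, outputs):
    """Append --hist / --genecounts to `extra` based on output suffixes.

    `fill_extra` is an illustrative helper, not part of the wrapper.
    """
    # A histogram output requires --hist.
    if any(path.endswith(".hist.txt") for path in outputs):
        if "--hist" not in extra:
            extra += " --hist"
    # A genes output requires --genecounts.
    if any(path.endswith(".genes.txt") for path in outputs):
        if "--genecounts" not in extra:
            extra += " --genecounts"
    return extra


outputs = [
    "buscount.barcodes.txt",
    "buscount.genes.txt",
    "buscount.hist.txt",
    "buscount.mtx",
]
print(fill_extra("", outputs))
```

A flag already present in extra is not added a second time, so passing extra="--hist" explicitly in the rule is harmless.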

Software dependencies

  • bustools=0.41.0

Input/Output

Input:

  • bus: A single BUS file, or a list of BUS files
  • genemap: Transcript to gene mapping
  • txnames: List of transcripts
  • ecmap: Equivalence classes for transcripts
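Both forms of the bus input end up as one space-separated argument string on the bustools command line. A minimal sketch of that normalisation, assuming plain Python strings and lists in place of the Snakemake input object (join_bus_files is a hypothetical name):

```python
def join_bus_files(bus):
    """Return the BUS inputs as one space-separated string.

    `join_bus_files` is an illustrative helper, not part of the wrapper.
    """
    if isinstance(bus, list):
        return " ".join(bus)
    return bus


print(join_bus_files("file.bus"))
print(join_bus_files(["a.bus", "b.bus"]))
```

Whichever form is used, all files go to a single bustools count call, which is why only one count matrix is produced.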

Output:

  • barcodes, equivalence classes, and count matrix

Params

  • extra: Optional parameters, besides --output, --ecmap, --genemap, and --txnames

Authors

  • Thibault Dayris

Code

#!/usr/bin/env python3
# coding: utf-8

"""Snakemake wrapper for bustools count"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2022, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake.shell import shell
from os.path import commonprefix

log = snakemake.log_fmt_shell(stdout=True, stderr=True)

# Get IO files and prefixes
bus_files = snakemake.input["bus"]
if isinstance(bus_files, list):
    bus_files = " ".join(bus_files)

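# The shared prefix of all outputs ends with the "." that precedes each
# suffix (e.g. "buscount."); drop that trailing dot to get the --output prefix.
out_prefix = commonprefix(snakemake.output)[:-1]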

# Fill extra parameters if needed
extra = snakemake.params.get("extra", "")
if any(outfile.endswith(".hist.txt") for outfile in snakemake.output):
    if "--hist" not in extra:
        extra += " --hist"

if any(outfile.endswith(".genes.txt") for outfile in snakemake.output):
    if "--genecounts" not in extra:
        extra += " --genecounts"

shell(
    "bustools count {extra} "
    "--output {out_prefix} "
    "--genemap {snakemake.input.genemap} "
    "--ecmap {snakemake.input.ecmap} "
    "--txnames {snakemake.input.txnames} "
    "{bus_files} "
    "{log}"
)
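The prefix derivation used by the wrapper can be tried in isolation. This sketch uses the output names from the example rule as plain strings; the variable names shared and single are introduced here for illustration only.

```python
from os.path import commonprefix

outputs = [
    "buscount.barcodes.txt",
    "buscount.CUPerCell.txt",
    "buscount.cu.txt",
    "buscount.genes.txt",
    "buscount.hist.txt",
    "buscount.mtx",
]

# Character-wise common prefix of all outputs, including the trailing dot.
shared = commonprefix(outputs)
# Drop the trailing dot to obtain the value passed to --output.
out_prefix = shared[:-1]
print(shared, out_prefix)

# Caveat: with a single declared output, the "common prefix" is the whole
# path, so stripping one character does not yield a usable prefix.
single = commonprefix(["buscount.mtx"])[:-1]
print(single)
```

Because the prefix is computed character by character, the approach relies on at least two outputs being declared and on their suffixes diverging immediately after the dot, as they do in the example rule.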