TABIX

Query a bgzip-compressed, tabix-indexed file with tabix.

URL: https://www.htslib.org/doc/tabix.html#QUERYING_AND_OTHER_OPTIONS

Example

This wrapper can be used in the following way:

rule tabix:
    input:
        ## list the bgzip-compressed file as the first input
        ## and its tabix index as the second input
        "{prefix}.bed.gz",
        "{prefix}.bed.gz.tbi",
    output:
        "{prefix}.output.bed",
    log:
        "logs/tabix/query/{prefix}.log",
    params:
        region="1",
        extra="",
    wrapper:
        "v1.9.0/bio/tabix/query"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

  • The region param (required) specifies the region of interest to retrieve.
  • The extra param allows for additional program arguments.
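
How these two params end up on the command line can be sketched as follows. This is a minimal illustration, not the wrapper itself: build_tabix_command is a hypothetical helper, and the file names and region string are assumed example values. Region strings follow tabix's "seq" or "seq:beg-end" form, and "-h" is tabix's option for printing header lines.

```python
import shlex


def build_tabix_command(infile, region, outfile, extra=""):
    """Assemble a tabix query command similar to the one the wrapper runs.

    Hypothetical helper for illustration; all arguments are example values.
    """
    if not region:
        # The wrapper treats region as required, so fail early here too.
        raise ValueError("region is required")
    parts = ["tabix"]
    if extra:
        # extra is passed through unchanged, e.g. "-h" to print header lines.
        parts.append(extra)
    parts += [shlex.quote(infile), shlex.quote(region)]
    return " ".join(parts) + " > " + shlex.quote(outfile)


print(build_tabix_command("calls.bed.gz", "1:1000-2000", "calls.output.bed", extra="-h"))
```

With an empty extra, the command reduces to the input file, the region, and the output redirection.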

Software dependencies

  • htslib=1.15

Input/Output

Input:

  • Bgzip compressed file (e.g. BED.gz, GFF.gz, or VCF.gz)
  • Tabix index file

Output:

  • Uncompressed subset of the input file from the given region
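
What "subset from the given region" means for a BED input can be approximated in plain Python. The sketch below is an assumption-laden illustration, not how tabix works internally (tabix uses the index rather than scanning every line): parse_region and bed_lines_in_region are hypothetical helpers, the records are made-up data, and only the "seq" and "seq:beg-end" region forms are handled. It assumes query regions are 1-based and inclusive, while BED intervals are 0-based and half-open.

```python
def parse_region(region):
    """Split 'seq' or 'seq:beg-end' (1-based, inclusive) into its parts."""
    if ":" not in region:
        return region, None, None
    seq, span = region.rsplit(":", 1)
    beg, end = span.replace(",", "").split("-")
    return seq, int(beg), int(end)


def bed_lines_in_region(lines, region):
    """Yield BED lines overlapping the region (BED is 0-based, half-open)."""
    seq, beg, end = parse_region(region)
    for line in lines:
        chrom, start, stop = line.split("\t")[:3]
        if chrom != seq:
            continue
        # Convert the BED interval to 1-based inclusive: [start + 1, stop].
        if beg is None or (int(start) + 1 <= end and int(stop) >= beg):
            yield line


# Made-up example records: chromosome, start, end, name.
records = ["1\t0\t100\tA", "1\t150\t300\tB", "2\t0\t50\tC"]
print(list(bed_lines_in_region(records, "1:101-200")))
```

Here record A covers positions 1 to 100 and so falls just outside the query, while record B overlaps it; a region of "1" alone would return both A and B.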

Authors

  • William Rowell

Code

__author__ = "William Rowell"
__copyright__ = "Copyright 2020, William Rowell"
__email__ = "wrowell@pacb.com"
__license__ = "MIT"

from snakemake.shell import shell

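# The snakemake object is injected by Snakemake when the wrapper runs.
extra = snakemake.params.get("extra", "")
# Send tabix's stderr to the log file; stdout carries the query result.
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# region is required, so it is read directly instead of via .get().
shell(
    "tabix {extra} {snakemake.input[0]} {snakemake.params.region} > {snakemake.output} {log}"
)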