PYFAIDX

Pythonic indexing, retrieval, and in-place modification of FASTA files using a samtools compatible index.

URL: https://github.com/mdshw5/pyfaidx?tab=readme-ov-file#cli-script-faidx

Example

This wrapper can be used in the following way:

rule test_pyfaidx_out_fasta:
    input:
        fasta="sequence.fasta",
        bed="interval.bed",
    output:
        "retrieved.fasta",
    log:
        "test_pyfaidx.log",
    params:
        extra="",
        regions="",
    wrapper:
        "v9.4.2/bio/pyfaidx"


rule test_pyfaidx_index_fasta:
    input:
        fasta="sequence.fasta",
        bed="interval.bed",
    output:
        "sequence.fasta.fai",
    log:
        "test_pyfaidx_index_fasta.log",
    params:
        extra="",
        regions="",
    wrapper:
        "v9.4.2/bio/pyfaidx"


rule test_pyfaidx_out_sizes:
    input:
        fasta="sequence.fasta",
        bed="interval.bed",
    output:
        "retrieved.chrom",
    params:
        extra="",
        regions="",
    log:
        "test_pyfaidx_out_sizes.log",
    wrapper:
        "v9.4.2/bio/pyfaidx"


rule test_pyfaidx_out_bed:
    input:
        fasta="sequence.fasta",
        bed="interval.bed",
    output:
        "retrieved.bed",
    params:
        extra="",
        regions="",
    log:
        "test_pyfaidx_out_bed.log",
    wrapper:
        "v9.4.2/bio/pyfaidx"


rule test_pyfaidx_fetch_regions:
    input:
        #bed="interval.bed",
        fasta="sequence.fasta",
    output:
        "regions.fa",
    params:
        extra="",
        regions="seq1",
    log:
        "test_pyfaidx_fetch_regions.log",
    wrapper:
        "v9.4.2/bio/pyfaidx"


rule test_pyfaidx_fetch_list_regions:
    input:
        #bed="interval.bed",
        fasta="sequence.fasta",
    output:
        "list_regions.fa",
    params:
        extra="",
        regions=["seq1", "seq2"],
    log:
        "test_pyfaidx_fetch_list_regions.log",
    wrapper:
        "v9.4.2/bio/pyfaidx"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Notes

The --transform parameter is inferred automatically from the output file path. When the output file is FASTA-formatted, the tool also creates a FASTA index. If no index exists alongside the input FASTA file, one is created automatically.
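
The inference described above can be sketched in plain Python. This is a minimal illustration, not the wrapper's actual code: the function name infer_output_args and the lookup table are hypothetical, and the extension list only covers the outputs used in the example rules plus the formats handled in the Code section. The real wrapper relies on get_format from snakemake-wrapper-utils, which may recognise more extensions.

```python
from pathlib import Path

# Hypothetical lookup table: output extension -> extra faidx arguments.
# None means "index only": no --out argument is passed at all.
TRANSFORM_BY_EXTENSION = {
    "fai": None,
    "fasta": "",
    "fa": "",
    "bed": "--transform bed",
    "chrom": "--transform chromsizes",
    "nuc": "--transform nucleotide",
}


def infer_output_args(output_path: str) -> str:
    """Return the --out/--transform arguments implied by the output path."""
    # Only the last extension matters, e.g. "sequence.fasta.fai" -> "fai".
    ext = Path(output_path).suffix.lstrip(".").lower()
    if ext not in TRANSFORM_BY_EXTENSION:
        raise ValueError(f"invalid output file format: {output_path}")
    transform = TRANSFORM_BY_EXTENSION[ext]
    if transform is None:
        return ""
    return f"--out {output_path} {transform}".strip()


if __name__ == "__main__":
    for path in ("retrieved.fasta", "retrieved.bed", "retrieved.chrom", "sequence.fasta.fai"):
        print(f"{path!r} -> {infer_output_args(path)!r}")
```

An unrecognised extension raises ValueError, which mirrors the final branch of the wrapper shown in the Code section.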

Software dependencies

  • pyfaidx=0.9.0.4

  • snakemake-wrapper-utils=0.8.0

Input/Output

Input:

  • fasta: Path to a sequence fasta file

  • bed: Path to BED intervals (optional)

Output:

  • Path to the modified sequences/intervals

Params

  • extra: Optional parameters besides --transform, --bed and --out.

  • regions: Optional region, or list of regions to retrieve from fasta file
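
To show how these parameters end up on the command line, here is a rough sketch of the assembly step. The helper build_faidx_command is hypothetical; the actual wrapper hands a template to Snakemake's shell() and also appends log redirection, which is omitted here. Joining a list of regions with spaces is an assumption about how the list is rendered in the final command.

```python
def build_faidx_command(fasta, out_args="", extra="", bed="", regions=""):
    """Assemble a faidx command line from wrapper-style inputs (illustrative only)."""
    # An optional BED file is forwarded through --bed, appended to `extra`.
    if bed:
        extra = f"{extra} --bed {bed}"
    # `regions` may be a single string or a list of region names.
    if isinstance(regions, (list, tuple)):
        regions = " ".join(regions)
    parts = ["faidx", extra, out_args, fasta, regions]
    # Drop empty pieces so the command has no stray double spaces.
    return " ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    print(build_faidx_command(
        "sequence.fasta",
        out_args="--out list_regions.fa",
        regions=["seq1", "seq2"],
    ))
```

Because --bed and --out are added by the wrapper itself, passing them again through extra would duplicate them, which is why extra is reserved for other options.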

Authors

  • Thibault Dayris

Code

# coding: utf-8

"""Snakemake-wrapper for pyfaidx"""

__author__ = "Thibault Dayris"
__copyright__ = "Copyright 2025, Thibault Dayris"
__email__ = "thibault.dayris@gustaveroussy.fr"
__license__ = "MIT"

from snakemake_wrapper_utils.snakemake import get_format
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True, append=False)

bed = snakemake.input.get("bed", "")
if bed:
    extra += f" --bed {bed}"

# Infer the --out and --transform arguments from the output file format.
out = str(snakemake.output[0])
fmt = get_format(out)
if fmt == "fai":
    # Index-only run: no --out is passed.
    out = ""
elif fmt == "fasta":
    out = f"--out {out}"
elif fmt == "bed":
    out = f"--out {out} --transform bed"
elif fmt == "chrom":
    out = f"--out {out} --transform chromsizes"
elif fmt == "nuc":
    out = f"--out {out} --transform nucleotide"
else:
    raise ValueError(f"invalid output file format: {out}")

# Optional region name, or list of region names, to retrieve.
regions = snakemake.params.get("regions", "")
shell("faidx {extra} {out} {snakemake.input.fasta} {regions} {log}")