SAMTOOLS FAIDX

Index a reference sequence in FASTA format, or extract regions from an indexed reference sequence.

URL: http://www.htslib.org/doc/samtools-faidx.html

Example

This wrapper can be used in the following way:

rule samtools_faidx:
    input:
        "{sample}.fa",
    output:
        "out/{sample}.fa.fai",
    log:
        "{sample}.log",
    params:
        extra="",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"


rule samtools_faidx_named:
    input:
        "{sample}.fa",
    output:
        fai="out/{sample}.named.fa.fai",
    log:
        "{sample}.named.log",
    params:
        extra="",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"


rule samtools_faidx_bgzip:
    input:
        "{sample}.fa.bgz",
    output:
        fai="out/{sample}.fas.bgz.fai",
        gzi="out/{sample}.fas.bgz.gzi",
    log:
        "{sample}.bgzip.log",
    params:
        extra="",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"


rule samtools_faidx_region_file:
    input:
        "{sample}.fa",
        fai="idx/{sample}.fa.fai",
        regions="{sample}.regions",
    output:
        "out/{sample}.region_file.fas",
    log:
        "{sample}.region_file.log",
    params:
        extra="--length 5",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"


rule samtools_faidx_region_array:
    input:
        "{sample}.fa",
        fai="idx/{sample}.fa.fai",
    output:
        "out/{sample}.region_array.fas",
    log:
        "{sample}.region_array.log",
    params:
        region=["ref", "ref2"],
        extra="--length 5",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"


rule samtools_faidx_bgzip_region:
    input:
        "{sample}.fa.bgz",
        fai="idx/{sample}.fa.bgz.fai",
        gzi="idx/{sample}.fa.bgz.gzi",
    output:
        "out/{sample}.region_bgzip.fas",
    log:
        "{sample}.region_bgzip.log",
    params:
        region="ref",
        extra="--length 5",
    wrapper:
        "v5.8.0-3-g915ba34/bio/samtools/faidx"

Note that input, output and log file paths can be chosen freely.

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • samtools=1.21

  • snakemake-wrapper-utils=0.6.2

Input/Output

Input:

  • reference sequence file (.fa)

  • regions: file listing the regions to extract (optional)

  • fai: index for reference file (optional)

  • gzi: index for BGZip’ed reference file (optional)

Output:

  • index of the reference sequence file (.fai), or the extracted sequences (FASTA) when a region or a regions file is given

  • fai: index for reference file (optional)

  • gzi: index for BGZip’ed reference file (optional)
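
To check the .fai output of an indexing rule, the index can be read as a plain tab-separated table. The sketch below assumes the standard five-column FASTA index layout (name, length, offset, bases per line, bytes per line). The FaiRecord type, the parse_fai helper and the sample line are hypothetical names and data used for illustration only; they are not part of the wrapper.

```python
from typing import Iterable, List, NamedTuple


class FaiRecord(NamedTuple):
    """One line of a FASTA index (hypothetical helper type)."""

    name: str       # sequence name
    length: int     # number of bases in the sequence
    offset: int     # byte offset of the first base in the FASTA file
    linebases: int  # bases per line
    linewidth: int  # bytes per line, including the newline


def parse_fai(lines: Iterable[str]) -> List[FaiRecord]:
    """Parse the lines of a .fai file into records, skipping blank lines."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Only the first five columns are used here.
        name, length, offset, linebases, linewidth = line.split("\t")[:5]
        records.append(
            FaiRecord(name, int(length), int(offset), int(linebases), int(linewidth))
        )
    return records


# Made-up index line, for illustration only.
for record in parse_fai(["ref\t45\t5\t45\t46\n"]):
    print(record.name, record.length)
```

With a real index, the lines would come from open("out/sample.fa.fai"), where the path is whatever the rule's output declares.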

Params

  • region: region to extract from input file (optional)

  • extra: additional program arguments (not -o).

Authors

  • Michael Chambers

  • Filipe G. Vieira

Code

__author__ = "Michael Chambers"
__copyright__ = "Copyright 2019, Michael Chambers"
__email__ = "greenkidneybean@gmail.com"
__license__ = "MIT"


from snakemake.shell import shell
from snakemake_wrapper_utils.samtools import get_samtools_opts

samtools_opts = get_samtools_opts(
    snakemake, parse_threads=True, parse_write_index=False, parse_output_format=False
)
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)


# Get regions (if present)
regions = snakemake.input.get("regions", "")
if regions:
    regions = f"--region-file {regions}"

region = snakemake.params.get("region", "")


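# Get FAI and GZI files: when extracting regions, the indexes are read
# from the inputs; when only indexing, they are written as outputs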
if region or regions:
    fai = snakemake.input.get("fai", "")
    gzi = snakemake.input.get("gzi", "")
else:
    fai = snakemake.output.get("fai", "")
    gzi = snakemake.output.get("gzi", "")

if fai:
    fai = f"--fai-idx {fai}"
if gzi:
    gzi = f"--gzi-idx {gzi}"


shell(
    "samtools faidx {fai} {gzi} {regions} {samtools_opts} {extra} {snakemake.input[0]} {region:q} {log}"
)
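
The way the wrapper code above assembles its command line can be sketched as a standalone function, which makes it easier to see what each example rule would run. This is a rough approximation and not the wrapper itself: build_faidx_command is a hypothetical helper, plain dicts stand in for snakemake.input and snakemake.output, the string returned by get_samtools_opts is passed in as an opaque argument (empty by default), and shlex.quote is assumed to be a close enough stand-in for Snakemake's {region:q} quoting.

```python
import shlex


def build_faidx_command(
    fasta, inputs=None, outputs=None, region="", extra="", samtools_opts=""
):
    """Approximate the samtools faidx command line built by the wrapper.

    `inputs` and `outputs` are plain dicts standing in for the named
    entries of snakemake.input and snakemake.output (an assumption).
    """
    inputs = inputs or {}
    outputs = outputs or {}

    # Optional file listing the regions to extract.
    regions_file = inputs.get("regions", "")
    regions_opt = f"--region-file {regions_file}" if regions_file else ""

    # When extracting, the indexes are inputs; when indexing, outputs.
    source = inputs if (region or regions_file) else outputs
    fai = source.get("fai", "")
    gzi = source.get("gzi", "")
    fai_opt = f"--fai-idx {fai}" if fai else ""
    gzi_opt = f"--gzi-idx {gzi}" if gzi else ""

    # Accept one region string or a list of them, quoting each one.
    if isinstance(region, str):
        region_list = [region] if region else []
    else:
        region_list = list(region)
    region_arg = " ".join(shlex.quote(r) for r in region_list)

    # Unlike the wrapper, empty parts are dropped to avoid double spaces.
    parts = [fai_opt, gzi_opt, regions_opt, samtools_opts, extra, fasta, region_arg]
    return " ".join(["samtools", "faidx"] + [p for p in parts if p])


# Roughly what the samtools_faidx_region_array rule would run for sample "a".
print(
    build_faidx_command(
        "a.fa",
        inputs={"fai": "idx/a.fa.fai"},
        region=["ref", "ref2"],
        extra="--length 5",
    )
)
```

The real command also includes the options returned by get_samtools_opts and the log redirection, which this sketch leaves out.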