VG AUTOINDEX GIRAFFE
Creates index files for mapping reads via vg giraffe.
URL: https://github.com/vgteam/vg
Example
This wrapper can be used in the following way:
rule vg_autoindex_giraffe:
input:
ref="{genome}.fasta",
# Optional vcf file containing haplotypes for graph creation
#vcf="",
output:
multiext("resources/{genome}", ".dist", ".min", ".giraffe.gbz"),
log:
"logs/autoindex/{genome}.log",
params:
extra="",
threads: 8
wrapper:
"v5.5.2-17-g33d5b76/bio/vg/autoindex"
rule vg_autoindex_map:
input:
ref="{genome}.fasta",
# Optional vcf file containing haplotypes for graph creation
#vcf="",
output:
multiext("resources/{genome}", ".xg", ".gcsa", ".gcsa.lcp"),
log:
"logs/autoindex/{genome}.log",
params:
extra="",
threads: 8
wrapper:
"v5.5.2-17-g33d5b76/bio/vg/autoindex"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
The extra param allows for additional arguments for vg autoindex.
Software dependencies
vg=1.61.0
Input/Output
Input:
ref
: FASTA referencevcf
: haplotypes in vcf format (optional)
Output:
.giraffe.gbz, .dist and .min-file
Code
__author__ = "Felix Mölder"
__copyright__ = "Copyright 2024, Felix Mölder"
__email__ = "felix.moelder@uk-essen.de"
__license__ = "MIT"
from snakemake.shell import shell
from os import path
# Extract arguments.
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)
prefix_path = path.commonprefix(snakemake.output).rstrip(".")
vcf_cmd = f"-v {snakemake.input.vcf}" if snakemake.input.get("vcf") else ""
def contains_ext(extensions):
return all(
any(out_file.endswith(ext) for out_file in snakemake.output)
for ext in extensions
)
if any(map(lambda x: "giraffe" in x, snakemake.output)):
workflow_cmd = "giraffe"
elif contains_ext([".xg", ".gcsa.lcp", ".gcsa"]):
workflow_cmd = "map"
else:
raise ValueError(
"Output files do not match any supported indexing workflows. Currently only map and giraffe are supported."
)
shell(
"(vg autoindex"
" -w {workflow_cmd}"
" -p {prefix_path}"
" -r {snakemake.input.ref}"
" {vcf_cmd}"
" -t {snakemake.threads}"
" {extra}) {log}"
)