SRA-TOOLS FASTERQ-DUMP
Download FASTQ files from SRA.
Example
This wrapper can be used in the following way:
rule get_fastq_pe:
    output:
        # the wildcard name must be accession, pointing to an SRA number
        "data/pe/{accession}_1.fastq",
        "data/pe/{accession}_2.fastq",
    log:
        "logs/pe/{accession}.log"
    params:
        extra="--skip-technical"
    threads: 6  # defaults to 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"


rule get_fastq_pe_gz:
    output:
        # the wildcard name must be accession, pointing to an SRA number
        "data/pe/{accession}_1.fastq.gz",
        "data/pe/{accession}_2.fastq.gz",
    log:
        "logs/pe/{accession}.gz.log"
    params:
        extra="--skip-technical"
    threads: 6  # defaults to 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"


rule get_fastq_pe_bz2:
    output:
        # the wildcard name must be accession, pointing to an SRA number
        "data/pe/{accession}_1.fastq.bz2",
        "data/pe/{accession}_2.fastq.bz2",
    log:
        "logs/pe/{accession}.bz2.log"
    params:
        extra="--skip-technical"
    threads: 6  # defaults to 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"


rule get_fastq_se:
    output:
        "data/se/{accession}.fastq"
    log:
        "logs/se/{accession}.log"
    params:
        extra="--skip-technical"
    threads: 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"


rule get_fastq_se_gz:
    output:
        "data/se/{accession}.fastq.gz"
    log:
        "logs/se/{accession}.gz.log"
    params:
        extra="--skip-technical"
    threads: 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"


rule get_fastq_se_bz2:
    output:
        "data/se/{accession}.fastq.bz2"
    log:
        "logs/se/{accession}.bz2.log"
    params:
        extra="--skip-technical"
    threads: 6
    wrapper:
        "v1.31.1-39-gb5b9878a/bio/sra-tools/fasterq-dump"
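Note that the output directory and the log file paths can be chosen freely, but output file names must keep the {accession} naming shown above, because the wrapper passes only the output directory and the accession to fasterq-dump.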
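To make the naming pattern used in the example rules concrete, here is a minimal sketch that lists the output paths a rule would declare for one accession. The helper name `expected_outputs` and the accession `SRR000001` are hypothetical and used only for illustration; the file name patterns are taken from the example rules above.

```python
import posixpath


def expected_outputs(accession, outdir="data/pe", paired=True, compression=""):
    """List output paths following the naming used in the example rules.

    `compression` is "", ".gz" or ".bz2" (the extensions used in the examples).
    This helper is hypothetical and not part of the wrapper.
    """
    if paired:
        names = [f"{accession}_1.fastq", f"{accession}_2.fastq"]
    else:
        names = [f"{accession}.fastq"]
    # posixpath keeps forward slashes, matching the paths in the Snakefile.
    return [posixpath.join(outdir, name + compression) for name in names]


# Hypothetical accession, for illustration only.
for path in expected_outputs("SRR000001", outdir="data/pe", compression=".gz"):
    print(path)
```

Such a list could be used, for example, as the input of a target rule that requests several accessions at once.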
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
- The output format is detected automatically from the output file extension: files ending in .gz are compressed with pigz, and files ending in .bz2 with pbzip2.
- Currently only supports PE samples
- The extra param allows for additional program arguments.
- More information: https://github.com/ncbi/sra-tools
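The format detection mentioned in the notes can be sketched as a small standalone function. This is a simplified illustration modelled on the wrapper code shown in the Code section, not the wrapper itself; the function name `compression_command` and the example paths are hypothetical.

```python
import os


def compression_command(output_path, threads, mem_mb=None):
    """Return the shell command that compresses the uncompressed FASTQ file,
    or an empty string if the requested output is not compressed.

    Hypothetical helper mirroring the extension check in the wrapper code.
    """
    out_name, out_ext = os.path.splitext(output_path)
    if out_ext == ".gz":
        parts = ["pigz", f"-p {threads}", out_name]
    elif out_ext == ".bz2":
        # As in the wrapper, the memory setting is only passed to pbzip2.
        mem = f"-m{mem_mb}" if mem_mb else ""
        parts = ["pbzip2", f"-p{threads}", mem, out_name]
    else:
        return ""
    return " ".join(part for part in parts if part)


# Hypothetical output paths, for illustration only.
print(compression_command("data/pe/SRR000001_1.fastq.gz", threads=6))
print(compression_command("data/pe/SRR000001_1.fastq.bz2", threads=6, mem_mb=1024))
print(repr(compression_command("data/pe/SRR000001_1.fastq", threads=6)))
```

The compressor always receives the path without the final extension, which is the uncompressed file that fasterq-dump writes.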
Software dependencies
sra-tools=3.0.5
pigz=2.6
pbzip2=1.1.13
snakemake-wrapper-utils=0.5.3
Authors
- Johannes Köster
- Derek Croote
- Filipe G. Vieira
Code
__author__ = "Johannes Köster, Derek Croote"
__copyright__ = "Copyright 2020, Johannes Köster"
__email__ = "johannes.koester@uni-due.de"
__license__ = "MIT"

import os
import tempfile
from snakemake.shell import shell
from snakemake_wrapper_utils.snakemake import get_mem

log = snakemake.log_fmt_shell(stdout=True, stderr=True)
extra = snakemake.params.get("extra", "")

# Parse memory
mem_mb = get_mem(snakemake, "MiB")

# Outdir
outdir = os.path.dirname(snakemake.output[0])
if outdir:
    outdir = f"--outdir {outdir}"

# Output compression
compress = ""
mem = f"-m{mem_mb}" if mem_mb else ""

for output in snakemake.output:
    out_name, out_ext = os.path.splitext(output)
    if out_ext == ".gz":
        compress += f"pigz -p {snakemake.threads} {out_name}; "
    elif out_ext == ".bz2":
        compress += f"pbzip2 -p{snakemake.threads} {mem} {out_name}; "

with tempfile.TemporaryDirectory() as tmpdir:
    mem = f"--mem {mem_mb}M" if mem_mb else ""

    shell(
        "(fasterq-dump --temp {tmpdir} --threads {snakemake.threads} {mem} "
        "{extra} {outdir} {snakemake.wildcards.accession}; "
        "{compress}"
        ") {log}"
    )
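To see which fasterq-dump command the wrapper code above would assemble for a given rule, the following sketch rebuilds the command string outside Snakemake without running anything. The function name `build_fasterq_dump_command`, the placeholder temporary directory `tmp`, and the accession `SRR000001` are assumptions for illustration; only flags that appear in the wrapper code are used.

```python
import os


def build_fasterq_dump_command(accession, outputs, threads=6, mem_mb=None,
                               extra="", tmpdir="tmp"):
    """Assemble the fasterq-dump call the way the wrapper code does.

    Hypothetical helper: it only builds a string and never executes it.
    """
    # The output directory is derived from the first declared output file.
    outdir = os.path.dirname(outputs[0])
    outdir_arg = f"--outdir {outdir}" if outdir else ""
    # The memory limit is only passed when a memory resource is set.
    mem_arg = f"--mem {mem_mb}M" if mem_mb else ""
    parts = [
        "fasterq-dump",
        f"--temp {tmpdir}",
        f"--threads {threads}",
        mem_arg,
        extra,
        outdir_arg,
        accession,
    ]
    # Drop empty fragments so the command has no doubled spaces.
    return " ".join(part for part in parts if part)


# Hypothetical accession and paths, for illustration only.
print(build_fasterq_dump_command(
    "SRR000001",
    ["data/pe/SRR000001_1.fastq", "data/pe/SRR000001_2.fastq"],
    threads=6,
    extra="--skip-technical",
))
```

Only the directory of the first output and the accession reach fasterq-dump, so the output file names themselves are determined by the tool and not by the rule.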