.. _`bio/bowtie2/align`: BOWTIE2 ======= .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/bowtie2/align?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/bowtie2/align Map reads with bowtie2. **URL**: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml Example ------- This wrapper can be used in the following way: .. code-block:: python rule test_bowtie2: input: sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"], idx=multiext( "index/genome", ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2", ), # ref="genome.fasta", #Required for CRAM output output: "mapped/{sample}.bam", # idx="", # metrics="", # unaligned="", # unpaired="", # unconcordant="", # concordant="", log: "logs/bowtie2/{sample}.log", params: extra="", # optional parameters threads: 8 # Use at least two threads wrapper: "v3.0.4/bio/bowtie2/align" use rule test_bowtie2 as test_bowtie2_se_gz with: input: sample=["reads/{sample}.1.fastq.gz"], idx=multiext( "index/genome", ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2", ), output: "mapped_se_gz/{sample}.bam", rule test_bowtie2_index: input: sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"], idx=multiext( "index/genome", ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2", ), output: "mapped_idx/{sample}.bam", idx="mapped_idx/{sample}.bam.bai", metrics="mapped_idx/{sample}.metrics.txt", unaligned="mapped_idx/{sample}.unaligned.sam", unpaired="mapped_idx/{sample}.unpaired.sam", # unconcordant="", # concordant="", log: "logs/bowtie2/{sample}.log", params: extra="", # optional parameters threads: 8 # Use at least two threads wrapper: "v3.0.4/bio/bowtie2/align" rule test_bowtie2_cram: input: sample=["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"], idx=multiext( "index/genome", ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2", ), ref="genome.fasta", output: "mapped_idx/{sample}.cram", # idx="", # metrics="", # unaligned="", # unpaired="", # unconcordant="", # concordant="", log: "logs/bowtie2/{sample}.log", params: extra="", # optional parameters threads: 8 # Use at least two threads wrapper: "v3.0.4/bio/bowtie2/align" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Notes ----- * This wrapper uses an inner pipe. Make sure to use at least two threads in your Snakefile. Software dependencies --------------------- * ``bowtie2=2.5.2`` * ``samtools=1.18`` * ``snakemake-wrapper-utils=0.6.2`` Input/Output ------------ **Input:** * ``sample``: FASTQ file(s) * ``idx``: Bowtie2 indexed reference index * ``ref``: Optional path to genome sequence (FASTA) * ``ref_fai``: Optional path to reference genome sequence index (FAI) **Output:** * SAM/BAM/CRAM file. This must be the first output file in the output file list. * ``idx``: Optional path to bam index. * ``metrics``: Optional path to metrics file. * ``unaligned``: Optional path to unaligned unpaired reads. * ``unpaired``: Optional path to unpaired reads that aligned at least once. * ``unconcordant``: Optional path to pairs that didn't align concordantly. * ``concordant``: Optional path to pairs that aligned concordantly at least once. Params ------ * ``extra``: additional program arguments (except for `-x`, `-U`, `-1`, `-2`, `--interleaved`, `-b`, `--met-file`, `--un`, `--al`, `--un-conc`, `--al-conc`, `-f`, `--tab6`, `--tab5`, `-q`, or `-p/--threads`) * ``interleaved``: Input `sample` contains interleaved paired-end FASTQ/FASTA reads. `False`(default) or `True`. Authors ------- * Johannes Köster * Filipe G. Vieira * Thibault Dayris Code ---- .. code-block:: python __author__ = "Johannes Köster" __copyright__ = "Copyright 2016, Johannes Köster" __email__ = "koester@jimmy.harvard.edu" __license__ = "MIT" import os from snakemake.shell import shell from snakemake_wrapper_utils.samtools import get_samtools_opts def get_format(path: str) -> str: """ Return file format since Bowtie2 reads files that could be gzip'ed (extension: .gz) or bzip2'ed (extension: .bz2). """ if path.endswith((".gz", ".bz2")): return path.split(".")[-2].lower() return path.split(".")[-1].lower() bowtie2_threads = snakemake.threads - 1 if bowtie2_threads < 1: raise ValueError( f"This wrapper expected at least two threads, got {snakemake.threads}" ) # Setting parse_threads to false since samtools performs only # bam compression. Thus the wrapper would use *twice* the amount # of threads reserved by user otherwise. samtools_opts = get_samtools_opts(snakemake, parse_threads=False) extra = snakemake.params.get("extra", "") log = snakemake.log_fmt_shell(stdout=True, stderr=True) n = len(snakemake.input.sample) assert ( n == 1 or n == 2 ), "input->sample must have 1 (single-end) or 2 (paired-end) elements." reads = "" if n == 1: if get_format(snakemake.input.sample[0]) in ("bam", "sam"): reads = f"-b {snakemake.input.sample}" else: if snakemake.params.get("interleaved", False): reads = f"--interleaved {snakemake.input.sample}" else: reads = f"-U {snakemake.input.sample}" else: reads = "-1 {} -2 {}".format(*snakemake.input.sample) if all(get_format(sample) in ("fastq", "fq") for sample in snakemake.input.sample): extra += " -q " elif all(get_format(sample) == "tab5" for sample in snakemake.input.sample): extra += " --tab5 " elif all(get_format(sample) == "tab6" for sample in snakemake.input.sample): extra += " --tab6 " elif all( get_format(sample) in ("fa", "mfa", "fasta") for sample in snakemake.input.sample ): extra += " -f " metrics = snakemake.output.get("metrics") if metrics: extra += f" --met-file {metrics} " unaligned = snakemake.output.get("unaligned") if unaligned: extra += f" --un {unaligned} " unpaired = snakemake.output.get("unpaired") if unpaired: extra += f" --al {unpaired} " unconcordant = snakemake.output.get("unconcordant") if unconcordant: extra += f" --un-conc {unconcordant} " concordant = snakemake.output.get("concordant") if concordant: extra += f" --al-conc {concordant} " index = os.path.commonprefix(snakemake.input.idx).rstrip(".") shell( "(bowtie2" " --threads {bowtie2_threads}" " {reads} " " -x {index}" " {extra}" "| samtools view --with-header " " {samtools_opts}" " -" ") {log}" ) .. |nl| raw:: html