.. _`bio/nanosim-h`: NANOSIM-H ========= .. image:: https://img.shields.io/github/issues-pr/snakemake/snakemake-wrappers/bio/nanosim-h?label=version%20update%20pull%20requests :target: https://github.com/snakemake/snakemake-wrappers/pulls?q=is%3Apr+is%3Aopen+label%3Abio/nanosim-h NanoSim-H is a simulator of Oxford Nanopore reads that captures the technology-specific features of ONT data, and allows for adjustments upon improvement of Nanopore sequencing technology. Example ------- This wrapper can be used in the following way: .. code-block:: python rule nanosimh: input: "{sample}.fa" output: reads = "{sample}.simulated.fa", log = "{sample}.simulated.log", errors = "{sample}.simulated.errors.txt" params: extra = "", num_reads = 10, perfect_reads = True, min_read_len = 10, log: "logs/nanosim-h/test/{sample}.log" wrapper: "v3.0.1/bio/nanosim-h" Note that input, output and log file paths can be chosen freely. When running with .. code-block:: bash snakemake --use-conda the software dependencies will be automatically deployed into an isolated environment before execution. Software dependencies --------------------- * ``nanosim-h=1.1.0.4`` Authors ------- * Michael Hall Code ---- .. code-block:: python """Snakemake wrapper for NanoSim-H.""" __author__ = "Michael Hall" __copyright__ = "Copyright 2019, Michael Hall" __email__ = "mbhall88@gmail.com" __license__ = "MIT" from snakemake.shell import shell def is_header(query): return query.startswith(">") def get_length_of_longest_sequence(fh): current_length = 0 all_lengths = [] for line in fh: if not is_header(line): current_length += len(line.rstrip()) else: all_lengths.append(current_length) current_length = 0 all_lengths.append(current_length) return max(all_lengths) # Placeholder for optional parameters extra = snakemake.params.get("extra", "") prefix = snakemake.params.get("prefix", snakemake.output.reads.rpartition(".")[0]) num_reads = snakemake.params.get("num_reads", 10000) profile = snakemake.params.get("profile", "ecoli_R9_2D") perfect_reads = snakemake.params.get("perfect_reads", False) min_read_len = snakemake.params.get("min_read_len", 50) max_read_len = snakemake.params.get("max_read_len", 0) # need to do this as the default read length of infinity can cause nanosim-h to # hang if the reference is short if max_read_len == 0: with open(snakemake.input[0]) as fh: max_read_len = get_length_of_longest_sequence(fh) perfect_reads_flag = "--perfect " if perfect_reads else "" # Formats the log redrection string log = snakemake.log_fmt_shell(stdout=True, stderr=True) # Executed shell command shell( "nanosim-h {extra} " "{perfect_reads_flag} " "--max-len {max_read_len} " "--min-len {min_read_len} " "--profile {profile} " "--number {num_reads} " "--out-pref {prefix} " "{snakemake.input} {log}" ) .. |nl| raw:: html