COOLTOOLS SADDLE
Calculate a saddle for a resolution in an .mcool file using a track
URL: https://github.com/open2c/cooltools
Example
This wrapper can be used in the following way:
rule cooltools_saddle:
    input:
        cooler="CN.mm9.1000kb.mcool",  ## Multiresolution cooler file
        track="CN_1000000.eigs.tsv",  ## Track file
        expected="CN_1000000.cis.expected.tsv",  ## Expected file
        view="mm9_view.txt",  ## File with the region names and coordinates
    output:
        saddle="CN_{resolution,[0-9]+}.saddledump.npz",
        digitized_track="CN_{resolution,[0-9]+}.digitized.tsv",
        fig="CN_{resolution,[0-9]+}.saddle.pdf",
    params:
        ## Add optional parameters
        range="--qrange 0.01 0.99",
        extra="",
    log:
        "logs/CN_{resolution}_saddle.log",
    wrapper:
        "v5.0.1/bio/cooltools/saddle"
# Note that in this test files are edited to remove
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
cooltools=0.7.0
Input/Output
Input:
a multiresolution cooler file (.mcool)
track file
expected file
(optional) view, a bed-style file with region coordinates and names to use for analysis
Output:
Saves a binary .npz file with the saddles and additional information about them, and a track file with digitized values. Can also save saddle plots using the extra --fig argument. All output files share the same prefix, taken from the first output argument, so giving one output argument is enough. The output path can contain a {resolution} wildcard that specifies the resolution for the analysis; in that case the resolution does not need to be specified as a parameter.
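The naming convention can be illustrated with a short sketch. It mirrors what the wrapper code in the Code section does when it moves files out of its temporary directory; the function name saddle_output_names is hypothetical and not part of the wrapper or of cooltools.

```python
from os import path


def saddle_output_names(prefix, fig=""):
    """List the files the wrapper expects `cooltools saddle -o <prefix>` to write.

    `fig` is the requested figure path; only its extension is used,
    exactly as in the wrapper code (hypothetical helper, for illustration).
    """
    names = [f"{prefix}.saddledump.npz", f"{prefix}.digitized.tsv"]
    if fig:
        # "CN_1000000.saddle.pdf" -> "pdf"
        ext = path.splitext(fig)[1][1:]
        names.append(f"{prefix}.{ext}")
    return names


if __name__ == "__main__":
    for name in saddle_output_names("out", fig="CN_1000000.saddle.pdf"):
        print(name)
```

Only the extension of the figure output matters here, so the figure format is effectively chosen by the file name given under output.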
Params
range
: What range of values from the track to use. Typically used to ignore outliers. --qrange 0 1 will use all data (default); --qrange 0.01 0.99 will ignore the first and last percentile; --range 0 5 will use values from 0 to 5.
resolution
: Optional; can instead be specified as a wildcard in the output.
extra
: Any additional arguments to pass.
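To make the difference between a quantile range and a value range concrete, here is a minimal sketch that converts quantile limits into value limits with NumPy. It only illustrates the idea of ignoring the first and last percentile; it is not the cooltools implementation, and the function name bounds_from_qrange is an assumption made for this example.

```python
import numpy as np


def bounds_from_qrange(values, qlo=0.0, qhi=1.0):
    """Return the (low, high) track values at the given quantiles.

    NaNs are ignored. Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.nanquantile(values, [qlo, qhi])
    return float(lo), float(hi)


if __name__ == "__main__":
    track = np.arange(101)  # toy track with values 0..100
    print(bounds_from_qrange(track))              # all data
    print(bounds_from_qrange(track, 0.01, 0.99))  # drop extreme percentiles
```

A quantile range adapts to the distribution of each track, whereas a value range such as 0 to 5 is fixed regardless of the data.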
Code
__author__ = "Ilya Flyamer"
__copyright__ = "Copyright 2022, Ilya Flyamer"
__email__ = "flyamer@gmail.com"
__license__ = "MIT"
from snakemake.shell import shell
from os import path
import tempfile
## Extract arguments
view = snakemake.input.get("view", "")
if view:
    view = f"--view {view}"

# Optionally select a specific column of the track file
track = snakemake.input.get("track", "")
track_col_name = snakemake.params.get("track_col_name", "")
if track and track_col_name:
    track = f"{track}::{track_col_name}"

expected = snakemake.input.get("expected", "")
range = snakemake.params.get("range", "--qrange 0 1")
extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=False, stderr=True)

# Resolution comes from params if given, otherwise from the wildcard
resolution = snakemake.params.get(
    "resolution", snakemake.wildcards.get("resolution", 0)
)
if not resolution:
    raise ValueError("Please specify resolution either as a wildcard or as a parameter")

# The figure format is taken from the extension of the requested output
fig = snakemake.output.get("fig", "")
if fig:
    ext = path.splitext(fig)[1][1:]
    fig = f"--fig {ext}"

# Write to a temporary prefix, then move each file to its requested path
with tempfile.TemporaryDirectory() as tmpdir:
    shell(
        "(cooltools saddle"
        " {snakemake.input.cooler}::resolutions/{resolution} "
        " {track} "
        " {expected} "
        " {view} "
        " {range} "
        " {fig} "
        " {extra} "
        " -o {tmpdir}/out)"
        " {log}"
    )
    shell("mv {tmpdir}/out.saddledump.npz {snakemake.output.saddle}")
    shell("mv {tmpdir}/out.digitized.tsv {snakemake.output.digitized_track}")
    if fig:
        shell("mv {tmpdir}/out.{ext} {snakemake.output.fig}")
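Once the rule has run, the .npz output can be inspected with NumPy. The sketch below only lists the names of the arrays stored in the archive, since the exact keys written by cooltools are not stated here; the function name list_npz_arrays and the example path are assumptions for illustration.

```python
import numpy as np


def list_npz_arrays(npz_path):
    """Return the sorted names of the arrays stored in an .npz archive."""
    with np.load(npz_path) as data:
        return sorted(data.files)


if __name__ == "__main__":
    # Example path following the output pattern of the rule above;
    # replace it with the file produced by your own run.
    print(list_npz_arrays("CN_1000000.saddledump.npz"))
```

Listing the names first avoids guessing at keys; individual arrays can then be read from the loaded archive by name.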