MOFA2 PLOTTING
Explore a trained model of a multi-omic data set with some basic plots.
URL: https://www.bioconductor.org/packages/release/bioc/html/MOFA2.html
Example
This wrapper can be used in the following way:
rule test_all:
    input:
        "results/data_overview.png",
        "results/data_variance_explained.png",
        "results/data_feature_weights.png",
        "results/data_data_heatmap.png",

rule mofa2_plotting:
    input:
        "{data}.hdf5",
    output:
        overview="results/{data}_overview.png",  # no requirements
        variance_explained="results/{data}_variance_explained.png",  # no requirements
        feature_weights="results/{data}_feature_weights.png",  # requires params: `view`, `factor`, `nfeatures`
        data_heatmap="results/{data}_data_heatmap.png",  # requires params: `view`, `factor`, `features`
    log:
        "log/{data}.log",
    params:
        view="view_0",  # the name of the view to be plotted
        factor=1,  # which factor to plot
        features=20,  # the number of features to plot
        nfeatures=10,  # the number of features to highlight
    wrapper:
        "v9.4.0/bio/mofa2/plotting"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Notes
Each plot is written to a separate PNG file. Select the plots you want by naming the corresponding outputs (overview, variance_explained, feature_weights, data_heatmap).
Make sure to set the params required for each requested output.
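The pairing of outputs and params can be checked before a run. The following is a minimal sketch in plain Python and is not part of the wrapper: the `REQUIRED_PARAMS` table only restates the requirements listed on this page, and `missing_params` is a hypothetical helper name.

```python
# Hypothetical helper, not part of the wrapper: restates the output/param
# requirements documented on this page so they can be checked up front.
REQUIRED_PARAMS = {
    "overview": [],
    "variance_explained": [],
    "feature_weights": ["view", "factor", "nfeatures"],
    "data_heatmap": ["view", "factor", "features"],
}


def missing_params(outputs, params):
    """Return {output_name: [missing param names]} for the requested outputs."""
    problems = {}
    for name in outputs:
        if name not in REQUIRED_PARAMS:
            raise ValueError(f"unknown output name: {name!r}")
        missing = [p for p in REQUIRED_PARAMS[name] if p not in params]
        if missing:
            problems[name] = missing
    return problems


# Example: requesting the heatmap but forgetting `features`.
requested = ["overview", "data_heatmap"]
params = {"view": "view_0", "factor": 1}
print(missing_params(requested, params))
```

In this example only `data_heatmap` is reported, with `features` as the missing param. Note that the wrapper script itself falls back to built-in defaults when a param is absent, so a check like this is a convenience for catching unintended defaults rather than something the wrapper enforces.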
Software dependencies
bioconductor-mofa2=1.20.2
Input/Output
Input:
An HDF5-file with the trained model.
Output:
overview: Optional. PNG file. Shows the number of views (rows) and groups (columns), their dimensionalities, and how much missing data each contains. Doesn’t require any params to be set.
variance_explained: Optional. PNG file. Shows how much of the variance is explained by each factor in each data modality. Includes all factors. Doesn’t require any params to be set.
feature_weights: Optional. PNG file. The weights score how strongly each feature relates to each factor. Features with no association with the factor have values close to zero, while features with a strong association have large absolute values. The sign of the weight indicates the direction of the effect: a positive weight means the feature has higher levels in the samples with positive factor values, and vice versa. Requires the following params to be set: view, factor, nfeatures.
data_heatmap: Optional. PNG file. Heatmap of the original data for the selected view, making visible the coordinated heterogeneity that MOFA captures. Top features are selected by their weight in the selected factor. Samples are ordered according to their corresponding factor value. Requires the following params to be set: view, factor, features.
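To make the feature_weights and data_heatmap descriptions concrete, here is a small Python sketch with made-up weights. It assumes that scaling divides every weight by the largest absolute weight (one reading of the "scale weights from -1 to 1" comment in the wrapper script) and that "top features" means those with the largest absolute weight. The feature names, values, and helper names are invented for illustration and are not MOFA2 functions.

```python
# Made-up weights for one factor in one view (illustrative values only).
weights = {"gene_a": 2.0, "gene_b": -4.0, "gene_c": 0.1, "gene_d": 1.0}


def scale_weights(w):
    """Divide by the largest absolute weight so values fall in [-1, 1]."""
    max_abs = max(abs(v) for v in w.values())
    if max_abs == 0:
        return dict(w)  # all-zero weights: nothing to scale
    return {name: v / max_abs for name, v in w.items()}


def top_features(w, n):
    """Names of the n features with the largest absolute weight."""
    return sorted(w, key=lambda name: abs(w[name]), reverse=True)[:n]


scaled = scale_weights(weights)
print(scaled)
print(top_features(weights, 2))
```

Under these assumptions `gene_b` keeps its negative sign and becomes -1.0, `gene_c` stays close to zero, and the two top features are `gene_b` and `gene_a`, regardless of sign.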
Params
view: String that specifies the name of the view from the data set that will be plotted. Required for feature_weights, data_heatmap. If unset, the script falls back to "view_0".
factor: Integer that specifies the index of the factor that will be plotted. Required for feature_weights, data_heatmap. If unset, the script falls back to 1.
features: Integer that specifies the number of features to be plotted. Required for data_heatmap. If unset, the script falls back to 10.
nfeatures: Integer that specifies the number of features to be highlighted. Required for feature_weights. If unset, the script falls back to 10.
Code
#!/bin/R
# load libraries
library(MOFA2)
library(ggplot2)
# if log file is provided, write log to that file
if (length(snakemake@log) > 0) {
  log <- file(snakemake@log[[1]], open = "wt")
  sink(log)
  sink(log, type = "message")
}
# cast the input path to character to avoid errors
input_path <- as.character(snakemake@input[[1]])
# load a MOFA2 model from an hdf5 file
model <- load_model(input_path)
# set the variables customisable with params, falling back to defaults
if ("view" %in% names(snakemake@params)) {
  view <- snakemake@params[["view"]]
} else {
  view <- "view_0"
}
if ("factor" %in% names(snakemake@params)) {
  factor <- snakemake@params[["factor"]]
} else {
  factor <- 1
}
if ("features" %in% names(snakemake@params)) {
  features <- snakemake@params[["features"]]
} else {
  features <- 10
}
if ("nfeatures" %in% names(snakemake@params)) {
  nfeatures <- snakemake@params[["nfeatures"]]
} else {
  nfeatures <- 10
}
# creating the requested plots
# overview plot
if ("overview" %in% names(snakemake@output)) {
  overview_path <- as.character(snakemake@output[["overview"]])
  p <- plot_data_overview(model)
  # write plot to file
  ggsave(plot = p, filename = overview_path)
}
# variance_explained plot
if ("variance_explained" %in% names(snakemake@output)) {
  variance_explained_path <- as.character(snakemake@output[["variance_explained"]])
  p <- plot_variance_explained(model, x = "view", y = "factor")
  # write plot to file
  ggsave(plot = p, filename = variance_explained_path)
}
# feature weights plot
if ("feature_weights" %in% names(snakemake@output)) {
  feature_weights_path <- as.character(snakemake@output[["feature_weights"]])
  p <- plot_weights(model,
    view = view, # which view to plot
    factor = factor, # which factor to plot
    nfeatures = nfeatures, # number of features to highlight
    scale = TRUE, # scale weights from -1 to 1
    abs = FALSE # take the absolute value?
  )
  # write plot to file
  ggsave(plot = p, filename = feature_weights_path)
}
# covariation patterns
## covariation patterns heatmap
if ("data_heatmap" %in% names(snakemake@output)) {
  data_heatmap_path <- as.character(snakemake@output[["data_heatmap"]])
  p <- plot_data_heatmap(model,
    view = view, # which view to plot
    factor = factor, # which factor to plot
    features = features, # how many features to plot
    cluster_rows = TRUE, cluster_cols = FALSE,
    show_rownames = TRUE, show_colnames = FALSE
  )
  # write plot to file
  ggsave(plot = p, filename = data_heatmap_path)
}
# check whether Rplots.pdf was created as a side effect and remove it
if (file.exists("Rplots.pdf")) {
  file.remove("Rplots.pdf")
}