--- title: "An introduction to ELAplus: Energy Landscape Analysis" author: "Sotaro Takano, Kenta Suzuki, Hiroaki Fujita" output: rmarkdown::html_vignette: toc: true number_sections: true vignette: > %\VignetteIndexEntry{An introduction to ELAplus: Energy Landscape Analysis(eval=FALSE)} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: markdown: wrap: 72 --- ```{=html} ``` ```{r setup, include=FALSE} library(knitr) opts_chunk$set( fig.align = "center", out.extra = 'style="max-width:100%; overflow-x:auto; white-space: nowrap;"', tidy = FALSE ) knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE ) ``` # Overview ELAplus provides tools for Energy Landscape Analysis (ELA), a framework for analyzing dynamical systems represented as weighted networks. The package is designed primarily for the analysis of ecological community composition data, but the framework is applicable to other multistable systems. The typical workflow consists of: **- Fitting community composition data to a (possibly extended) pairwise maximum entropy model** **- Constructing an energy landscape from the fitted model** **- Identifying stable states and transition structures** **- Visualizing the resulting landscape and interactions** The theoretical background is described in [Suzuki et al. (2021)](https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecm.1469). # Example: Basic Energy Landscape Analysis This vignette demonstrates a standard ELA workflow using test dataset. ## Input dataset - `baseabtable`: **samples** - **species** abundance table (data.frame or matrix). - Rows correspond to sampling units (sites, time points, subjects, etc.) - Columns correspond to species (or OTUs/ASVs/features/taxa). - `basemetadata` (optional): a **samples** - **factors** metadata table (data.frame or matrix). - Rows correspond to sampling units (same as those in `baseabtable`) - Columns correspond to environmental parameters (e.g., temperature, pH, etc.). ```{r} # load package library(ELAplus) # load test data set # species abundance table data("baseabtable") head(baseabtable) # metadata (environmental factors) data("basemetadata") head(basemetadata) ``` ## (Optional) Data generation using a Gibbs sampling (heat-bath algorithm) The simulated community composition data are also available. In this example, the data comprising `24` species and `512` samples were generated through `100` iterations of Gibbs sampling (heat-bath algorithm). Both implicit/explicit environmental factors (`h.act` and `g.act`) are considered. The number of explicit environmental factors is specified by `ne` (`2` factors in this example). `j.act` is the interaction matrix. ```{r} # load package library(ELAplus) # Generate random parameters hb_params <- hb.paramgen(24, ne = 2) # the number of environmental factors h.act <- hb_params[[1]] j.act <- hb_params[[2]] g.act <- hb_params[[3]] # Simulate community composition data rand_data <- HeatBath(100, 512, h.act, j.act, g = g.act) rand_mat <- rand_data[[1]] rand_enmat <- rand_data[[2]] ``` ## Formatting the dataset ELAplus expects the community data in a standardized representation. `Formatting()` converts raw input tables into aligned occurrence (binary) matrix and (optionally) environmental factor matrix. ### Inputs - `baseabtable`: **samples** - **species** abundance table (data.frame or matrix). - `basemetadata` (optional): a **samples** - **factors** metadata table (data.frame or matrix). - This is used when you want to include environmental parameters (e.g., temperature, pH, treatment group). - If you do not have environmental data, pass NULL (or an empty object). ```{r} formatted <- Formatting(baseabtable = baseabtable, basemetadata = basemetadata) ``` ### Key preprocessing options #### Normalization (normalize) - If `normalize = 1`, the abundances within each sample are normalized so that each row sums to 1. This is useful when: - total sequencing depth differs strongly between samples etc... - If `normalize = 0`, the original scale is preserved. #### Occurrence binarization (the “0/1 threshold”) - ELA operates on binary presence/absence states (each species is present or absent). - `Formatting()` creates the occurrence matrix by applying the first element of `parameters`: - `parameters[1]` (“0/1 threshold”): a numeric cutoff used to define presence. - Example: `parameters[1] = 0.05` means “abundance \> 0.05 is present; otherwise absent”. #### Species filtering by prevalence (min/max occurrence thresholds) - Species that are too rare or almost always present typically add little information or inflate the results in construction of energy landscape. - `Formatting()` therefore filters species based on prevalence across samples using: - `parameters[2]` (“min occurrence threshold”) - `parameters[3]` (“max occurrence threshold”) - These are interpreted as proportions of samples in which the species is present **after binarization**. - Example: - `parameters = c(0.01, 0.01, 0.99)` means: - present(1) if abundance \> 0.01, - keep species whose presence frequency is \> 0.01 and \< 0.99. ```{r, eval=FALSE} formatted <- Formatting(baseabtable = baseabtable, basemetadata = basemetadata, normalize = 1, parameters = c(0.01, 0.01, 0.99)) #> Processed 256 samples. #> Relative abundance threshold = 0.01 #> Occurrence threshold (lower) = 0.01 #> Occurrence threshold (upper) = 0.99 #> Selected 16 out of 16 species. ocvecs <- formatted[[1]] # community composition binary matrix abvecs <- formatted[[2]] # community abundance matrix (not binarized) envecs <- formatted[[3]] # environmental variables (optional) ``` ```{r, echo=FALSE, out.width="80%"} knitr::include_graphics("figures/ParameterMatrix.png") ``` ## Parameter fitting ### Extended maximum pairwise entropy model The energy $E(\sigma^{(k)},\varepsilon)$ of each community composition $\sigma^{(k)}$ is assigned by imposing the potential structure to the state space by introducing the extended pairwise maximum entropy model. Briefly, the energy is determined by three parameters. For details, please see [Appendix]. ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/EPMEM_abstract2.png") ``` - `h`: Implicit (unobserved) environmental parameter for each species - `J`: Species-species interaction matrix - `g`: Explicit environmental parameters for each species ### Parameter fitting algorithm Fitting the parameters of the pairwise maximum entropy model is performed using a stochastic approximation algorithm. We approximate the intractable model expectation in the likelihood gradient with a persistent (warm-started) Gibbs/heat-bath chain advanced by one sweep per parameter iteration, corresponding to persistent contrastive divergence with one sweep (PCD-1), also known as stochastic maximum likelihood (SML). It repeatedly simulates data under the current parameters and then updates `h`, `J`, and `g` to reduce the discrepancies between observed vs simulated metrics. Two observed co-occurrence statistics are used for the fitting. 1. Species–species co-occurrence: $y = ocvecs^{T}ocvecs$ 2. Environment–species co-occurrence: $y^{env} = envecs^{T}ocvecs$ Then, the algorithm repeatedly updates the parameters by comparing - observed vs simulated species–species co-occurrence $y_{diff} = y_{observed} - y_{simulated}$ - observed vs simulated environment–species co-occurrence $y^{env}_{diff} = y^{env}_{observed} - y^{env}_{simulated}$ Each parameter independently updated by those metrics: $$ \Delta{h} \propto diag(y_{diff}) $$ $$ \Delta{J} \propto y_{diff} $$ $$ \Delta{g} \propto y^{env}_{diff} $$ ## Hyperparameter search for fitting For simplicity, the usage of ELA without explicit environmental parameters (`envecs`) is first explained. Fitting the pairwise maximum entropy model with `runSA()` depends on several optimization configurations, and the best settings can vary across datasets (number of species, sample size, sparsity). ELAplus provides a dedicated grid-search routine to choose hyperparameters empirically. `Findbp()` performs a grid search over three key hyperparameters: \-`lmd` : **a sparsity penalty** to encourage sparse matrix. Larger values typically lead to fewer nonzero parameters. \-`we` : **AdamW weight decay**, a form of L2 regularization applied during optimization. \-`maxlrs` : **maximum learning rate** in fitting algorithm (SML). ```{r, eval=FALSE} bpresult <- Findbp(ocvecs, rep = 2, threads = 2, cvmode="10fold",nrepeat = 10, we=c(0.001),totalit=4000, lmd=c(1e-05,1e-03,0.01),maxlrs = c(0.005), tol=0.03, runadamW=TRUE,sparse=TRUE,fastfitting=TRUE) #Start hyper-parameter search: #There are 300 tasks #Finished 300 tasks #SA: elapsed time 15.08 sec bp <- bpresult[[1]] # typically best bpresult bpm <- as.numeric(unlist(strsplit(names(bp)[1], split = "_"))) allresults <- bpresult[[2]] ``` ### What Findbp does **1. Splits samples into training and test subsets** - The data are split using cross-validation based on cvmode. - For "10fold" or "5fold", samples are randomly assigned into k folds (repeated `nrepeat` times), and each fold is used as the test set while the remaining folds are used for training. - For "loo" (leave-one-out), each sample is used once as the test set while the rest form the training set. **2. Fits the model on the training subset for each grid point** - For each combination of candidate hyperparameters, the function calls `runSA()`. **3. Evaluates predictive performance on held-out samples** - Each fitted model is scored by its predictive error on the test subset (lower is better). - The best-performing settings are returned. - If `fastfitting=TRUE`, the function returns candidates with the lowest iterations within `tol` of the best observed error. ### Fitting results can be visualized by `plotSAtest`. ```{r plotSAtest, eval=FALSE} plotSAtest(allresults) # "id" corresponds to hyperparameter: lmd_we_maxlr ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/plotSAtest.png") ``` ### Fitting the extended maximum pairwise entropy model The best parameters found by `Findbp()` are used for real analyses by `runSA()`. ```{r, eval=FALSE} sa <- runSA( ocvecs, enmat = NULL, rep = 32, getall = FALSE, lambda = bpm[[1]], we = bpm[[2]], maxlr = bpm[[3]], totalit = bpm[[4]] ) #> Start parameter fitting: #> .SA: elapsed time 0.13 sec ``` ## Constructing the energy landscape After fitting the (extended) pairwise maximum entropy model with `runSA()`, the next step is to construct energy landscape over community compositions using the fitted parameters. In ELAplus, this is performed by `ELA()`, which identifies: **1. Stable states** Here, the base of the basin (with minimal energy) is regarded as a stable state, which is distinct from the equilibrium point of the dynamical system or the attractor. For details, see [Suzuki et al. (2021)](https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1002/ecm.1469), **2. Tipping points (energy barriers connecting pairs of basins/stable states).** A connecting pair of basins have basin boundary and the point with the lowest energy barrier is regarded as tipping point here. The fitted model defines an energy value for any community composition (typically a binary presence/absence vector of community members). ```{r, eval=FALSE} ela <- ELA( sa, env = NULL, threads = 1, SS.itr = 20000, FindingTip.itr = 10000, reporting = TRUE ) #> Start ELA: #> 7 stable states were found. #> Checking 21 tipping points. #> converting... #> ELA: elapsed time 2.19 sec ``` ## ELA output After stable states and tipping points are computed, `ELA()` summarizes and standardizes the outputs. This conversion step produces a compact representation of the energy landscape structure: - [[1]] Stable state IDs (64-base index encoded from community composition array by `bin2id()`) - [[2]] Stable state energies - [[3]] Tipping point ID (for stable-state pairs) - [[4]] Tipping energies (corresponding barrier energies) The output format is compatible to downstream steps such as pruning (ELPruning) and visualization (e.g., disconnectivity graphs). ```{r, eval=FALSE} ela[[1]] # stable state IDs #> [1] "09x" "01t" "EWB" "EY8" "1uV" "1mL" "17s" ela[[2]] # stable state energies #> [1] -11.099418 -10.297931 -9.544050 -8.961634 -8.005040 -6.440385 -6.269140 ``` ### What ELA does 1. Identifying stable states by repeated steepest descent (`SSestimate`) - The number of repetitions is controlled by `SS.itr` (default 20000). - More iterations increase the chance of discovering all major stable states, especially when the number of species is large, landscape has many stable states, etc... 2. Finding tipping points between stable states - Once a set of stable states is obtained, `ELA()` evaluates transitions between them by searching for tipping points between all pairs of stable states. - The number of iterations is controlled by `FindingTip.itr`. As with `SS.itr`, increasing `FindingTip.itr` improves robustness at the cost of runtime. This step can be computationally heavy because it is performed across stable-state pairs. ## Pruning stable states The energy landscape constructed by `ELA()` first identified stable states as much as possible. Some of these stable states correspond to very shallow basins. Such stable states are often sensitive to noise and does not largely explain dominant multistability of the system. `ELPruning()` addresses this issue by iteratively pruning shallow basins and merging them into deeper, neighboring basins. ```{r, eval=FALSE} elap <- ELPruning(ela, th=0.1, type="relative") #> Start pruning: #> *... #> ELPruning: elapsed time 0.05 sec ``` ### Output and interpretation `ELPruning()` returns a list with two elements: [[1]] Pruned ELA object (same format as the output of `ELA()`): - [[1]] updated stable state IDs, - [[2]] updated stable state energies, - [[3]] updated tipping point IDs, - [[4]] updated tipping energies. ```{r, eval=FALSE} elap[[1]][[1]] # updated stable state IDs #> [1] "09x" "EWB" "EY8" "1uV" elap[[1]][[2]] # updated stable state energies #> [1] -11.099418 -9.544050 -8.961634 -8.005040 ``` [[2]] Stable-state mapping table - ss.before.pruning: stable state IDs from the original (unpruned) landscape - ss.after.pruning: corresponding stable state IDs after pruning ```{r, eval=FALSE} elap[[2]] #> ss.before.pruning ss.after.pruning #> 1 09x 09x #> 2 01t 09x #> 3 EWB EWB #> 4 EY8 EY8 #> 5 1uV 1uV #> 6 1mL 09x #> 7 17s 09x ``` ### Basin depth and pruning criterion For each stable state, a basin depth is defined as the energy difference between the energy of the stable state itself and the lowest tipping point energy connecting that stable state to any other stable state ($\Delta E$). The basin depth measures how “robust” a stable state is against transitions: deeper basins correspond to more robust, while shallow basins correspond to more unstable states. In `ELPruning()`, there are two methods `relative` or `bootstrap`. ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/deltaE_scheme.png") ``` #### Relative depth pruning If `type="relative"` is set, `th` (relative depth threshold) specifies the minimum acceptable energy barrier relative to the deepest basin barrier. $$ \Delta E_{\mathrm{min}} < th * \Delta E_{\mathrm{max}} $$ For example, `th = 0.2` means that basins shallower than 20% of the deepest basin ($\Delta E_{\mathrm{max}}$) are pruned. The results of pruning is usually sensitive to this value. ```{r, eval=FALSE} elap <- ELPruning(ela, th=0.2, type="relative") #> Start pruning: #> *... #> ELPruning: elapsed time 0.05 sec elap[[1]][[1]] # updated stable state IDs #> [1] "09x" "EWB" "1uV" ``` #### bootstrap samples pruning Alternatively, users can perform pruning of stable states by calculating vulnerabilities in bootstrap samples in each stable state. This method is explained in the next section:[Pruning with bootstrap resampling]. ## Visualization ### Disconnectivity graph A disconnectivity graph is a compact visualization of the global structure of an energy landscape. - It summarizes how stable states are separated by energy barriers (tipping points). - The vertical axis represents energy - Branches illustrate how stable states are connected and separated by tipping points. ```{r showDG, eval=FALSE} # Before pruning showDG(ela, ocvecs, "example") ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/DGexample.png") ``` ```{r showDG_pruned, eval=FALSE} # After pruning showDG(elap[[1]], ocvecs, "example") ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/DGexample_pruned.png") ``` #### How to interpret the graph - Lower energies correspond to more stable configurations. - Branching structure indicates which basins are “close” in terms of energy barriers. - Barrier height as a transition difficulty proxy. #### Optional checking bootstrap resampling The reliability of stable states can be also examined by the bootstrap resampling of training data sets (the occurrence matrix). The function, `bootstrap_SA()`, performs parameter fitting with bootstrap resampling of the data sets. `bootstrap_ELA()` uses the results of `bootstrap_SA()` (i.e., estimated parameters: h, J) for computing the variability of energy gaps between tipping points and stable states arising from uncertainty due to finite sample size. If the estimated stable state is statistically reliable, the estimated $\Delta E$ is expected to be consistently smaller than 0 across bootstrap resamples of the training data. Therefore, the probability of bootstrap samples with $\Delta E < 0$ reflects the statistical significance of the estimated stable state. Here, this quantity is referred to as a "bootstrap-approximated p-value" and is used to assess statistical reliability. As an example, the distribution of $\Delta E$ between "01t" and "09/" is shown as a histogram. More than 25% of bootstrap samples yield $\Delta E > 0$ (i.e., p-val \> 0.25), and this case is regarded as statistically not reliable. ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/deltaE_hist.png") ``` If `bootstrap_ene` is specified in `showDG()`, the function displays an approximate p-value in each stable state. This option is helpful when you want to see uncertainty in estimated energies in ELA. ```{r showDG_pvalue, eval=FALSE} bs_params <- bootstrap_SA(ocvecs, rep = 16, threads=8, bootstrap.it = 1000, lambda = bpm[[1]],we = bpm[[2]], maxlr = bpm[[3]], totalit = bpm[[4]]) bootstrap <- bootstrap_ELA(bs_params,ocvecs,ela=ela,threads = 1) ela_rep <- bootstrap[[1]] bootstrap_ene <- bootstrap[[2]] showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene, min_pval = 0.01, scale_labels = c(0.05,0.1,0.2,0.5)) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/DGexample_pval.png") ``` #### Pruning with bootstrap resampling The pruning stable states is also possible using the bootstrap results. The `ELPruning()` with `type="bootstrap"`used the distribution of $\Delta E$ in bootstrap samples and evaluated whether the predicted stable states are statistically reliable. If the p-value is above threshold (e.g., 0.05), the function prunes such unreliable and shallow basins (stable states) and merges them into deeper, neighboring basins. The pruned graph is follows. ```{r showDG_IQR_pruned, eval=FALSE} elap_rep <- ELPruning(ela, type = "bootstrap",bootstrap_ene=bootstrap_ene,lower_qr = 0.25) showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene, min_pval = 0.01, scale_labels = c(0.05,0.1,0.2,0.5)) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/DGexample_pval_pruned.png") ``` If `getGraohObj=TRUE`, `showDG()` function returns a data frame for plotting a disconnectivity graph. In the following example, disconnectivity graph is drawn using `ggplot()`. Tipping points and stable states are shown as triangles and circles, respectively, and colored by energies. ```{r showDG_manual, eval=FALSE} grobj_to_plot <- showDG(ela_rep,ocvecs,bootstrap_ene = bootstrap_ene,getGraphObj = TRUE) grobj_point <- grobj_to_plot[grobj_to_plot$point_str != "",] g <- ggplot( grobj_to_plot, aes(x=.data$nodes2xposi, y=.data$energy, label=.data$point_str)) + xlab("") + geom_text(hjust=0.75, vjust=2, aes(fontface=2)) + geom_path() + geom_point( data = grobj_point, mapping = aes( x = .data$nodes2xposi, y = .data$energy, color = .data$energy, shape = .data$type, ),size = 4 ) + scale_color_viridis_c(option = "plasma",alpha=0.8, direction=-1,na.value = "grey") + coord_cartesian(xlim=c(0.75,7.5), ylim=c(-5.8,-11.5)) plot(g) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/DG_manual.png") ``` ### Species interaction graph `showIntrGraph()` visualizes the pairwise interaction structure inferred by the fitted model in a 2D “interaction space”. It extracts estimated parameters from the resulting interaction matrix (jest). - Each species is plotted as a point at its PCoA coordinates. - Links with value \> `th` are classified as positive, while links with value \< `-th` are classified as negative. The plot draws these two link types separately and colors them differently (positive = red, negative = blue). ```{r ShowIntrGraph, eval=FALSE} showIntrGraph( elap[[1]], sa, th = 0.01, annot_adj = c(0.75, 1.00) ) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/IntrGraph.png") ``` # Environmental gradient analysis The stability structure of a community is not fixed, but changes continuously along an environmental gradient (e.g., temperature, nutrient availability, disturbance intensity). `GradELA()` is designed to capture this dependence by estimating energy landscapes repeatedly across a controlled range of an environmental factor and summarizing how stable states and transition barriers change. When environmental variables (`envecs`) are included, ELAplus supports **Gradient Energy Landscape Analysis** via `GradELA()`. The parameter fitting process is same as that for normal ELA except for specifying `enmat`. ```{r, eval=FALSE} bpresult <- Findbp(ocvecs, enmat = envecs, rep = 2, threads = 2, cvmode="10fold",nrepeat = 10, we=c(0.001),totalit=4000, lmd=c(1e-05,1e-03,0.01),maxlrs = c(0.005), tol=0.03, runadamW=TRUE,sparse=TRUE,fastfitting=TRUE) bp <- bpresult[[1]] bpm <- as.numeric(unlist(strsplit(names(bp)[1], split = "_"))) sa <- runSA( ocvecs, enmat = envecs, rep = 16, getall = FALSE, lambda = bpm[[1]], we = bpm[[2]], maxlr = bpm[[3]], totalit = bpm[[4]] ) ``` ## Concept of GradELA `GradELA()` proceeds in three conceptual stages: 1. Specification of an environmental gradient - One environmental factor `eid` is varied over a specified `range` and `steps`. - `eid` is environmental factor for changing (one column in the environmental matrix `enmat`). - `range` is the minimum and maximum values of the environmental factor. - `steps` is how many discrete points are evaluated. 2. Energy landscape construction at each step - For each environmental value along the gradient,`ELA()` and `ELPruning()` are performed. - When varying one environmental factor, all remaining factors must be assigned fixed values. - This is controlled by the `env` (it corresponds to a row in `enmat`) as follows: - If `env` is provided, it is used as the reference vector for all environmental factors except `eid`. - If `env` is not provided, the mean vector of `enmat` is used automatically. ```{r, eval=FALSE} # This process takes few minutes with a small number of threads, and here pre-computed results are shown. gela <- GradELA( sa = sa, eid = "factor.1", enmat = envecs, env = NULL, range = c(-1, 1), steps = 32, th = 0.05, threads = 8, pruning_type = "relative", bs_params = bs_params, ocmat = ocvecs) ``` Users can use bootstrap samples in ELPruning process by `pruning_type="bootstrap"`. Other options are equivalent to `ELPruning()`. ```{r, eval=FALSE} # This process takes few minutes with a small number of threads, and here pre-computed results are shown. bs_params <- bootstrap_SA(ocvecs, enmat = envecs,rep = 16, threads=8,bootstrap.it = 1000, lambda = bpm[[1]],we = bpm[[2]], maxlr = bpm[[3]], totalit = bpm[[4]]) gela <- GradELA( sa = sa, eid = "factor.1", enmat = envecs, env = NULL, range = c(-1, 1), steps = 32, lower_qr = 0.25, threads = 8, pruning_type = "bootstrap", bs_params = bs_params, ocmat = ocvecs) ``` ## Output `GradELA()` returns a list with three elements: **1. Pruned ELA results for each step** - A list of energy landscape objects (same format as `ELA()` output, after pruning). **2. Environmental value sequence** - The numeric vector of environmental values used for the target factor (`eid`). **3. Reference environmental vector** - The full environmental vector used as a baseline for non-varied factors. ```{r, eval=FALSE} gela[[1]][[1]][[1]][[1]] # stable state IDs at 1st environmental parameter # [1] "0Ht" "EWB" "1uV" gela[[1]][[20]][[1]][[1]] # stable state IDs at 20th environmental parameter # [1] "09x" "EWB" ``` ## Visualization of gradient ELA results The resulting object can then be plotted using `showSSD()`, which visualize the regime shift: how the number, identity, and energy of stable states change along the environmental gradient. - The plot is drawn by overlaying multiple line traces, one per stable state. - x-axis corresponds to the change of environmental parameters and - y-axis corresponds to the energy of stable states - Changes in community state energy across the gradient reflect how resilient they are under changing conditions. - A plot discontinuity indicates a stable state that doesn't emerge under these conditions (or is pruned). ```{r showSSD, eval=FALSE, fig.width=8, fig.height=6, message=FALSE} # GradELA result (ELPruning with type="relative"). showSSD(gela) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/SSD.png") ``` ```{r showSSD_bootstrap, eval=FALSE, fig.width=8, fig.height=6, message=FALSE} # GradELA result (ELPruning with type="bootstrap"). showSSD(gela) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/SSD_pruned.png") ``` If users want to modify the figure configuration of the diagram, `showSSD_ggplot()` function returns a data frame for plotting. Then users can the configuration manually using `ggplot()`. ```{r showSSD_ggplot, eval=FALSE, fig.width=8, fig.height=6, message=FALSE} ssd_df <- showSSD_ggplot(gela,plot = FALSE,getSSDobj = TRUE) g <- ggplot(ssd_df, aes(x = env, y = Energy, color = SS_ID)) + geom_point(shape = 19) + geom_line(aes(group = SS_ID)) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/SSD_ggplot.png") ``` ## 3D plot of gradient ELA results `showGELA3D()` visualizes a smoothed energy landscape surface produced by `GELAObj()` and overlays stable-state trajectories. - **Z axis (Energy):** model-predicted energy evaluated under the fitted parameters at each environmental parameter value. - **Y axis (y):** the environmental gradient value used in `GradELA()`. - **X axis (x):** stable states. Positions of stable states are determined by their nearest-neighbor relationships (in Hamming distance). ```{r GELA3D-demo, eval=FALSE, fig.width=8, fig.height=6, message=FALSE} # This process takes few minutes, and here pre-computed result is drawn. gelaobj <- GELAObj(gela, sa=sa,threads=2) showGELA3D(gelaobj,theta = 30,phi = 30) ``` ```{r, echo=FALSE, out.width="95%"} knitr::include_graphics("figures/GELA3D.png") ``` # Appendix **Overview of extended maximum pairwise entropy model** The model determines the probability of the occurrence of community composition $\sigma^{(k)}$ in an environmental condition $\varepsilon = (\varepsilon_{1}, \varepsilon_{2}, ..., \varepsilon_{M})$, environmental factors such as pH or temperature. $$ E(\sigma^{(k)},\varepsilon) = - \sum_{i=1}^{S} h_i \sigma_i^{(k)} - \sum_{j=1}^{S} \sum_{i=1}^{M} g_{ij} \varepsilon_i^{(k)} \sigma_j^{(k)} - \sum_{i=1}^{S} \sum_{j=1,i \neq j}^{S} J_{ij} \sigma_i^{(k)} \sigma_j^{(k)} $$ where: - $S$ is the number of species,\ - $M$ is the number of environmental variables,\ - $h_i$ is the implicit environmental variable for species $i$,\ - $g_{ij}$ is the explicit effect of environmental variable $i$ on species $j$,\ - $J_{ij}$ is the interaction strength between species $i$ and $j$,\ - $\sigma_i^{(k)}$ is the state (occurrence) of species $i$ at sample $k$,\ - $\varepsilon_i^{(k)}$ is the value of environmental variable $i$ at sample $k$.\ - $J_{ij}$ is the interaction strength between species $i$ and $j$,