---
title: "Meta-analysis with the R package confMeta"
author: "Saverio Fontana, Felix Hofmann, Samuel Pawel, Leonhard Held"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Meta-analysis with the R package confMeta}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    fig.width = 7,
    fig.height = 5
)
```

In a standard meta-analysis, the goal is to combine the evidence from multiple independent studies. The main focus of the `confMeta` package is to perform meta-analysis using the $p$-value function: a curve built by evaluating the $p$-value across a sequence of possible effect sizes (the null hypothesis values, $\mu$).

## 1. Creating a `confMeta` object

### Simulating individual studies

Consider the hypothetical scenario where we have $n = 5$ individual studies that should be combined into a single confidence interval using a confidence level of $1 - \alpha = 0.95$. We can simulate these by running the following code:

```{r, message=FALSE, warning=FALSE}
library(confMeta)
library(ggplot2)

set.seed(42)
n <- 5
conf_level <- 0.95
estimates <- rnorm(n, mean = 0.5, sd = 0.2)
SEs <- rgamma(n, shape = 5, rate = 20)
study_names <- c("Study A", "Study B", "Study C", "Study D", "Study E")
```

Our goal is to combine them. However, this requires the specification of a $p$-value function, i.e. a method that takes the individual estimates (argument `estimates`), their standard errors (argument `SEs`), and the mean under the null hypothesis (argument `mu`) as input and returns the corresponding $p$-value at the specified mean value. The core function of the package is `confMeta()`. You provide your data and specify which $p$-value combination function you want to use.
The possible choices provided by the package are:

* Harmonic mean (function: `p_hmean`)
* Wilkinson (function: `p_wilkinson`)
* Pearson (function: `p_pearson`)
* Edgington (function: `p_edgington`)
* Weighted Edgington (function: `p_edgington_w`)
* Fisher (function: `p_fisher`)
* Tippett (function: `p_tippett`)
* Stouffer (function: `p_stouffer`)

The corresponding documentation can be inspected by running

```{r, eval = FALSE}
?confMeta
?p_value_functions
```

For this example, let's use **Edgington's method**, which is based on the sum of the individual $p$-values. Thus, we can create the `confMeta` object as follows:

```{r}
cm_edgington <- confMeta(
    estimates = estimates,
    SEs = SEs,
    study_names = study_names,
    conf_level = conf_level,
    fun = p_edgington,
    fun_name = "Edgington"
)
```

As the variable `cm_edgington` now contains the `confMeta` object, we can inspect it by running the following code:

```{r}
# Use the specific print method
print(cm_edgington)

# See what elements it has
names(cm_edgington)

# Check out the combined confidence interval(s)
cm_edgington$joint_cis
```

When created, the `confMeta` object automatically stores several pieces of information:

* **The combined confidence interval**, obtained by inverting your chosen $p$-value function (identifying the range of $\mu$ values where the $p$-value exceeds $\alpha$).
* **Individual CIs**, computed as Wald-type confidence intervals for each individual study.
* **Reference comparisons**, obtained by fitting multiple standard meta-analytic models to your data for comparison. These include fixed-effect, random-effects, Hartung-Knapp, and Henmi-Copas.
* **Heterogeneity statistics**, providing estimates of Cochran's $Q$, $I^2$, and $\tau^2$.
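To make the inversion concrete, here is a minimal base-R sketch of how an Edgington-style $p$-value function can be evaluated and inverted on a grid. It uses the normal approximation for the study-level two-sided $p$-values and the Irwin-Hall distribution (the null distribution of a sum of uniforms) for the sum of $p$-values. The data values (`est_demo`, `se_demo`) and the grid resolution are illustrative, and `confMeta`'s internal implementation may differ in detail:

```{r}
# Two-sided p-value function of a single study (normal approximation)
p_study <- function(mu, estimate, SE) {
    2 * (1 - pnorm(abs(estimate - mu) / SE))
}

# Irwin-Hall CDF: distribution of the sum of n independent Uniform(0, 1)
# variables, which is the null distribution of the sum of p-values
pirwinhall <- function(s, n) {
    k <- 0:floor(s)
    sum((-1)^k * choose(n, k) * (s - k)^n) / factorial(n)
}

# Edgington-style combined p-value at a single value of mu
p_comb <- function(mu, estimates, SEs) {
    pirwinhall(sum(p_study(mu, estimates, SEs)), length(estimates))
}

# Invert on a grid: the combined CI is the range of mu where p(mu) > alpha
est_demo <- c(0.77, 0.39, 0.57, 0.63, 0.58)  # illustrative estimates
se_demo  <- c(0.26, 0.21, 0.14, 0.27, 0.20)  # illustrative standard errors
mu_grid <- seq(-1, 2, length.out = 4001)
p_grid <- vapply(mu_grid, p_comb, numeric(1),
                 estimates = est_demo, SEs = se_demo)
range(mu_grid[p_grid > 0.05])  # approximate 95% confidence interval
```

The grid approach is only an approximation; `confMeta()` determines the interval bounds more precisely when it constructs the object.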
## Using Combination Functions Directly

Note that for specific purposes, like testing a given point hypothesis (for example, testing whether the global effect $\mu = 0$), you can pass your data directly to any of the $p$-value combination functions:

```{r}
# Test the null hypothesis that mu = 0 using Edgington's method
p_val_0 <- p_edgington(
    estimates = estimates,
    SEs = SEs,
    mu = 0,
    input_p = "two.sided",
    output_p = "two.sided"
)
p_val_0
```

Because these functions are vectorized over `mu`, you can also pass a vector of hypotheses (e.g., `mu = c(-0.5, 0, 0.5)`) to generate a sequence of combined $p$-values without creating a full `confMeta` object.

## 2. Visualization of the `confMeta` object

The package also contains an `autoplot` method that can be used to visualize the $p$-value function. The documentation for this function can be inspected by running

```{r, eval = FALSE}
?autoplot.confMeta
```

By default, `autoplot()` returns a two-panel figure containing both the $p$-value function (drapery plot) and a standard forest plot.

```{r}
# show the p-value function
autoplot(cm_edgington, type = "p")
```

```{r}
# show the forest plot
autoplot(cm_edgington, type = "forest")
```

```{r, fig.height = 6}
# show both
autoplot(cm_edgington)
```

* **Top panel ($p$-value function plot):** The gray dashed lines represent the two-sided $p$-value functions of the individual studies. The solid colored line is the combined Edgington $p$-value function. The horizontal dashed line marks the significance level corresponding to the chosen confidence level (e.g., $\alpha = 0.05$ for a 95% CI). The points where the combined curve intersects this horizontal line determine the lower and upper bounds of the combined confidence interval.
* **Bottom panel (forest plot):** This panel displays the standard study-level estimates with 95% CIs. The diamonds at the bottom represent the combined estimates.

## 3. Comparing Multiple Combination Methods

It is also possible to plot multiple methods side-by-side. This can be done by creating more `confMeta` objects. For this example, let's use Fisher's method (based on the sum of the log-transformed $p$-values) and the harmonic mean method.

```{r}
cm_fisher <- confMeta(
    estimates = estimates,
    SEs = SEs,
    study_names = study_names,
    fun = p_fisher,
    fun_name = "Fisher",
    input_p = "two.sided"
)

cm_hmean <- confMeta(
    estimates = estimates,
    SEs = SEs,
    study_names = study_names,
    fun = p_hmean,
    fun_name = "Harmonic Mean",
    distr = "chisq"
)
```

Now, we simply pass all three objects into `autoplot()`:

```{r, fig.height=8}
autoplot(cm_edgington, cm_fisher, cm_hmean)
```

## 4. More Options: Weights and Heterogeneity

`confMeta` allows you to adjust the $p$-value functions to account for varying study precision and between-study heterogeneity.

### Weighted methods

Certain combination rules, such as `p_edgington_w` or `p_stouffer`, accept study weights. A common approach is to weight studies by the inverse of their standard errors, so that more precise studies have greater influence on the combined $p$-value.

```{r}
weights <- 1 / SEs

cm_weighted <- confMeta(
    estimates = estimates,
    SEs = SEs,
    fun = p_edgington_w,
    fun_name = "Edgington (Weighted)",
    w = weights,
    input_p = "two.sided"
)
```

### Heterogeneity adjustments

By default, the $p$-value functions are evaluated under a fixed-effect model (`heterogeneity = "none"`). You can change this by applying either an **additive** or a **multiplicative** heterogeneity correction. These corrections rely on `estimate_tau2()` and `estimate_phi()`: the between-study variance parameters must be computed from your data before being passed to the main combination functions.

**1. Additive heterogeneity (random effects)**

The additive model assumes a between-study variance of $\tau^2$. You can estimate this using the `estimate_tau2()` function, which is a wrapper of `meta::metagen()`.
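For intuition about what this estimate represents: under the additive model, each study's squared standard error is inflated from $\sigma_i^2$ to $\sigma_i^2 + \tau^2$ before its $p$-value function is evaluated. As a rough, self-contained illustration (this is *not* `confMeta`'s internal code, and the data values are illustrative), the classical DerSimonian-Laird moment estimator of $\tau^2$ can be written in base R as:

```{r}
# DerSimonian-Laird moment estimator of the between-study variance tau^2
tau2_dl <- function(estimates, SEs) {
    w <- 1 / SEs^2                        # inverse-variance weights
    y_bar <- sum(w * estimates) / sum(w)  # fixed-effect pooled estimate
    Q <- sum(w * (estimates - y_bar)^2)   # Cochran's Q statistic
    k <- length(estimates)
    max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))
}

# With additive heterogeneity, the standard errors entering the p-value
# functions would then become sqrt(SEs^2 + tau2)
est_demo <- c(0.77, 0.39, 0.57, 0.63, 0.58)  # illustrative estimates
se_demo  <- c(0.26, 0.21, 0.14, 0.27, 0.20)  # illustrative standard errors
tau2_dl(est_demo, se_demo)
```

The Paule-Mandel estimator used below solves a similar moment equation iteratively instead of in closed form.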
Here, we use the Paule-Mandel ("PM") estimator:

```{r}
# Estimate tau^2
tau2_est <- estimate_tau2(estimates, SEs, method.tau = "PM")

# Create a random-effects confMeta object using additive heterogeneity
cm_additive <- confMeta(
    estimates = estimates,
    SEs = SEs,
    study_names = study_names,
    fun = p_edgington,
    fun_name = "Edgington (Additive RE)",
    heterogeneity = "additive",
    tau2 = tau2_est,
    input_p = "two.sided"
)
```

**2. Multiplicative heterogeneity**

Alternatively, the multiplicative model assumes that the variances are scaled by a factor $\phi$. You can estimate this using the `estimate_phi()` function:

```{r}
# Estimate phi
phi_est <- estimate_phi(estimates, SEs)

# Create a confMeta object using multiplicative heterogeneity
cm_multiplicative <- confMeta(
    estimates = estimates,
    SEs = SEs,
    study_names = study_names,
    fun = p_edgington,
    fun_name = "Edgington (Multiplicative)",
    heterogeneity = "multiplicative",
    phi = phi_est,
    input_p = "two.sided"
)
```

By supplying these adjusted objects to `autoplot()`, you can visually compare how the choice of heterogeneity model affects the width of the combined confidence intervals.

```{r, fig.height=6}
autoplot(cm_edgington, cm_additive, cm_multiplicative)
```

## 5. Integration with bayesmeta

It is also possible to visually compare frequentist $p$-value combinations with Bayesian approaches. When a `bayesmeta` object (from the package `bayesmeta`) is provided to the `autoplot()` function, its posterior summary is added to the forest plot.

```{r, message=FALSE, warning=FALSE}
if (requireNamespace("bayesmeta", quietly = TRUE)) {
    # Run a standard bayesmeta model
    bm <- bayesmeta::bayesmeta(y = estimates, sigma = SEs)

    # Add the bayesmeta posterior diamond to the forest plot
    autoplot(cm_edgington, type = "forest", bayesmeta = bm)
}
```

## 6. Mantel-Haenszel Pooling

Sometimes, data are reported in the form of 2x2 contingency tables. To use `confMeta`, you first need to convert these counts into effect estimates and standard errors.
We highly recommend using the `escalc()` function from the `metafor` package for this step. Once you have the estimates, `confMeta` can also integrate the raw 2x2 tables to provide Mantel-Haenszel (MH) pooling as the fixed-effect reference method.

```{r}
# Simulated 2x2 table data for 3 clinical trials
clinical_data <- data.frame(
    ai = c(12, 5, 22),   # Events in experimental group
    bi = c(88, 45, 78),  # Non-events in experimental group
    ci = c(18, 10, 30),  # Events in control group
    di = c(82, 40, 70),  # Non-events in control group
    n1i = c(100, 50, 100),
    n2i = c(100, 50, 100)
)

# Use metafor::escalc to obtain log odds ratios (yi) and variances (vi)
if (requireNamespace("metafor", quietly = TRUE)) {
    es <- metafor::escalc(
        measure = "OR",
        ai = ai, bi = bi, ci = ci, di = di,
        data = clinical_data
    )

    # Create a confMeta object using MH pooling for the fixed-effect reference
    cm_clinical <- confMeta(
        estimates = es$yi,
        SEs = sqrt(es$vi),  # Remember to take the square root of the variance!
        study_names = c("Trial 1", "Trial 2", "Trial 3"),
        fun = p_fisher,
        fun_name = "Fisher (Clinical)",
        MH = TRUE,                  # Turn on Mantel-Haenszel pooling
        table_2x2 = clinical_data,  # Provide the raw table
        measure = "OR"              # Specify the effect measure
    )

    # Notice how the "Fixed effect" comparison now uses the MH method
    cm_clinical$comparison_cis
}
```
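For reference, the conversion that `escalc(measure = "OR")` performs amounts, in essence, to computing the log odds ratio and its standard error from the four cell counts (this sketch omits the continuity corrections `escalc` can apply when a cell count is zero):

```{r}
# Log odds ratio and its standard error from 2x2 cell counts
# (ai/bi: events/non-events in the experimental group,
#  ci/di: events/non-events in the control group)
log_or <- function(ai, bi, ci, di) {
    c(yi = log((ai * di) / (bi * ci)),
      sei = sqrt(1 / ai + 1 / bi + 1 / ci + 1 / di))
}

# First trial from the example above
log_or(ai = 12, bi = 88, ci = 18, di = 82)
```

The `sei` component corresponds to `sqrt(vi)` in the `escalc()` output, which is exactly the quantity passed to the `SEs` argument above.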