---
title: "Publication-Ready Visualisation"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Publication-Ready Visualisation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Overview

Two functions produce publication-ready figures and tables with minimal post-processing:

| Function | Output | Typical use |
|---|---|---|
| `plot_forest()` | Forest plot (PNG / PDF / JPG / TIFF) | Regression results from `assoc_*()` |
| `plot_tableone()` | Table 1 (DOCX / HTML / PDF / PNG) | Baseline characteristics |

When `save = TRUE`, both functions write all supported formats in a single call and return the plot/table object invisibly for further customisation.

---

## `plot_forest()`: Forest Plot

### Minimal example

`plot_forest()` takes a data frame whose **first column** is the row label, plus any additional display columns. The CI graphic and the formatted `OR (95% CI)` text column are inserted automatically.

```{r forest-minimal}
library(ukbflow)

df <- data.frame(
  item      = c("Exposure vs. control", "Unadjusted", "Fully adjusted"),
  `Cases/N` = c("", "89 / 4 521", "89 / 4 521"),
  p_value   = c(NA_real_, 0.001, 0.006),
  check.names = FALSE
)

# Estimates and CI bounds, one value per row (NA for the parent row).
# Stored as variables because the chunks below reuse them.
est   <- c(NA, 1.52, 1.43)
lower <- c(NA, 1.18, 1.11)
upper <- c(NA, 1.96, 1.85)

p <- plot_forest(
  data      = df,
  est       = est,
  lower     = lower,
  upper     = upper,
  ci_column = 3L,
  indent    = c(0L, 1L, 1L),
  p_cols    = "p_value",
  xlim      = c(0.5, 3.0)
)
plot(p)
```

### Building the input data frame from `assoc_*()` results

The output of `assoc_coxph()` (and its siblings) can be reshaped directly into the format expected by `plot_forest()`:

```{r forest-from-assoc}
dt <- ops_toy(scenario = "association")
dt <- dt[dm_timing != 1L]

res <- assoc_coxph(
  data         = dt,
  outcome_col  = "dm_status",
  time_col     = "dm_followup_years",
  exposure_col = "p20116_i0",
  covariates   = c("bmi_cat", "tdi_cat", "p1558_i0")
)
res <- as.data.frame(res)

# Reshape: one row per model, label column first
df2 <- data.frame(
  item         = c("Smoking status", as.character(res$model)),
  `N / Events` = c("", paste0(res$n, " / ", res$n_events)),
  p_value      = c(NA_real_, res$p_value),
  check.names  = FALSE
)

p <- plot_forest(
  data      = df2,
  est       = c(NA, res$HR),
  lower     = c(NA, res$CI_lower),
  upper     = c(NA, res$CI_upper),
  ci_column = 3L,
  indent    = c(0L, rep(1L, nrow(res))),
  p_cols    = "p_value",
  xlim      = c(0.5, 2.5)
)
plot(p)
```

### Key parameters

**CI appearance**

```{r forest-ci}
# uses df, est, lower, upper from the minimal example above
p <- plot_forest(
  data       = df,
  est        = est,
  lower      = lower,
  upper      = upper,
  ci_column  = 3L,
  ci_col     = c("grey50", "steelblue", "steelblue"),  # per-row colours
  ci_sizes   = 0.5,                                    # point size
  ci_Theight = 0.15,                                   # cap height
  ref_line   = 1,        # reference line (use 0 for beta coefficients)
  xlim       = c(0.2, 5),
  ticks_at   = c(0.5, 1, 2, 3)
)
```

**Row labels and indentation**

```{r forest-indent}
# indent = 0 → bold parent row; indent >= 1 → indented sub-row (plain)
p <- plot_forest(
  data       = df,
  est        = est,
  lower      = lower,
  upper      = upper,
  ci_column  = 3L,
  indent     = c(0L, 1L, 1L),          # parent + 2 sub-rows
  bold_label = c(TRUE, FALSE, FALSE)   # explicit control (overrides indent default)
)
```

**P-value formatting**

```{r forest-pval}
# p_cols: column names in data that contain raw numeric p-values.
# Values < 10^(-p_digits) are displayed as e.g. "<0.001".
# bold_p = TRUE bolds all p < p_threshold (default 0.05).
p <- plot_forest(
  data        = df,
  est         = est,
  lower       = lower,
  upper       = upper,
  ci_column   = 3L,
  p_cols      = "p_value",
  p_digits    = 3L,
  bold_p      = TRUE,
  p_threshold = 0.05
)
```

**Column headers and alignment**

`header` renames all columns in the *final* rendered table. The final table always has `ncol(data) + 2` columns: the original columns, plus the `gap_ci` graphic column and the auto-generated `OR (95% CI)` text column. Pass `""` for the gap column position.

```{r forest-header}
# data has 3 columns → final table has 5 columns (original 3 + gap_ci + OR label)
# Layout with ci_column = 3L: item | Cases/N | gap_ci | OR (95% CI) | p_value
p <- plot_forest(
  data      = df,
  est       = est,
  lower     = lower,
  upper     = upper,
  ci_column = 3L,
  # positions: col 1 | col 2 | gap | OR label | col 5
  header    = c("Comparison", "Cases / N", "", "HR (95% CI)", "P-value")
)
```

`align` controls per-column text alignment across all `ncol(data) + 2` columns: `-1` = left, `0` = centre, `1` = right. `NULL` (default) left-aligns column 1 and centres the rest.

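
Because `header` and `align` are both indexed against the rendered table rather than against `data`, it can help to write the expected column order down before filling them in. The sketch below uses a hypothetical helper, `final_layout()`, which is not part of `ukbflow`; it assumes the gap and estimate columns are inserted at position `ci_column`, as in the layout comment of the header chunk above.

```{r forest-layout-sketch}
# Hypothetical helper (not a ukbflow function): predicts the column order of
# the rendered table, assuming gap_ci and the estimate label are inserted at
# position `ci_column`.
final_layout <- function(data_cols, ci_column, est_label = "OR (95% CI)") {
  before <- data_cols[seq_len(ci_column - 1L)]
  after  <- data_cols[seq_along(data_cols) >= ci_column]
  c(before, "gap_ci", est_label, after)
}

layout <- final_layout(c("item", "Cases/N", "p_value"), ci_column = 3L)
print(layout)

# header and align need one entry per rendered column
header <- c("Comparison", "Cases / N", "", "HR (95% CI)", "P-value")
align  <- c(-1L, 0L, 0L, 0L, 1L)
stopifnot(
  length(header) == length(layout),
  length(align)  == length(layout)
)
```

Checking the lengths up front catches the common mistake of supplying `ncol(data)` entries instead of `ncol(data) + 2`.
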
```{r forest-align}
p <- plot_forest(
  data      = df,
  est       = est,
  lower     = lower,
  upper     = upper,
  ci_column = 3L,
  # label left | Cases/N centre | gap | OR centre | p right
  align     = c(-1L, 0L, 0L, 0L, 1L)
)
```

**Background and borders**

```{r forest-style}
p <- plot_forest(
  data         = df,
  est          = est,
  lower        = lower,
  upper        = upper,
  ci_column    = 3L,
  background   = "zebra",        # "zebra" | "bold_label" | "none"
  bg_col       = "#F0F0F0",      # shading colour
  border       = "three_line",   # "three_line" | "none"
  border_width = 3               # scalar or length-3 vector (top / mid / bottom)
)
```

**Layout and saving**

```{r forest-save}
# df, est, lower, upper: the data frame and estimate vectors
# from the minimal example above
p <- plot_forest(
  data        = df,
  est         = est,
  lower       = lower,
  upper       = upper,
  ci_column   = 3L,
  row_height  = NULL,           # auto (8 / 12 / 10 / 15 mm); or scalar/vector
  col_width   = NULL,           # auto (rounds up to nearest 5 mm)
  save        = TRUE,
  dest        = "forest_main",  # extension ignored; all 4 formats saved
  save_width  = 20,             # cm
  save_height = NULL            # auto: nrow(data) * 0.9 + 3 cm
)
```

> All four formats (PNG, PDF, JPG, TIFF) are written at **300 dpi** with a
> white background. The function returns the plot object invisibly; display
> with `plot(p)` or `grid::grid.draw(p)`.

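
To check a saved figure against a journal's size limits, the automatic height and the resulting pixel dimensions can be estimated in advance. This is a rough sketch only: `auto_height_cm()` and `cm_to_px()` are hypothetical helpers, not `ukbflow` functions, and they assume the height rule quoted in the `save_height` comment above (`nrow(data) * 0.9 + 3` cm) together with the stated 300 dpi.

```{r forest-save-size}
# Hypothetical helpers (not part of ukbflow), for planning figure dimensions.

# Automatic height in cm, assuming the rule nrow(data) * 0.9 + 3
auto_height_cm <- function(n_rows) n_rows * 0.9 + 3

# Convert centimetres to pixels at a given resolution (1 inch = 2.54 cm)
cm_to_px <- function(cm, dpi = 300) round(cm / 2.54 * dpi)

height_cm <- auto_height_cm(3)   # three rows, as in `df`
width_px  <- cm_to_px(20)        # save_width = 20 cm
height_px <- cm_to_px(height_cm)

c(height_cm = height_cm, width_px = width_px, height_px = height_px)
```

For the three-row example this gives a height of 5.7 cm, or about 2362 x 673 pixels for the raster formats.
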

---

## `plot_tableone()`: Baseline Characteristics Table

### Minimal example

```{r tableone-minimal}
library(gtsummary)
data(trial)  # built-in gtsummary dataset

plot_tableone(
  data   = trial,
  vars   = c("age", "marker", "grade"),
  strata = "trt",
  save   = FALSE
)
```

### With SMD, custom labels, and export

```{r tableone-full}
plot_tableone(
  data    = trial,
  vars    = c("age", "marker", "grade", "stage"),
  strata  = "trt",
  label   = list(age ~ "Age (years)", marker ~ "Marker level (ng/mL)"),
  add_p   = TRUE,   # Wilcoxon / chi-squared p-values; formatted as <0.001
  add_smd = TRUE,
  overall = TRUE,
  dest    = "table1",
  save    = TRUE
)
```

### Key parameters

**Variable types and statistics**

```{r tableone-types}
dt <- as.data.frame(ops_toy(scenario = "association"))

plot_tableone(
  data      = dt,
  vars      = c("p21022", "p21001_i0", "p31", "p20116_i0"),
  strata    = "dm_status",
  type      = list(p21022 = "continuous2"),  # multi-line continuous summary
  statistic = list(
    all_continuous()  ~ "{mean} ({sd})",
    all_categorical() ~ "{n} ({p}%)"
  ),
  digits    = list(p21022 ~ 1, p21001_i0 ~ 1),
  missing   = "ifany",  # show missing counts when present
  save      = FALSE
)
```

**SMD column**

The SMD column summarises covariate balance between groups:

- Continuous variables: Cohen's *d* (pooled-SD formula)
- Categorical variables: RMSD of group proportions

```{r tableone-smd}
# dt: the data frame created in the tableone-types chunk above
plot_tableone(
  data    = dt,
  vars    = c("p21022", "p21001_i0", "p31"),
  strata  = "dm_status",
  add_smd = TRUE,
  save    = FALSE
)
```

**Excluding rows**

Use `exclude_labels` to remove specific level rows from the rendered table (e.g. a redundant reference category or an "Unknown" level):

```{r tableone-exclude}
plot_tableone(
  data           = dt,
  vars           = c("p31", "p20116_i0"),
  strata         = "dm_status",
  exclude_labels = "Never",  # e.g. remove reference category from display
  save           = FALSE
)
```

**Export formats**

When `save = TRUE`, four files are written in a single call:

| Format | Tool | Notes |
|---|---|---|
| `.docx` | `gt::gtsave()` | Ready for Word submission |
| `.html` | `gt::gtsave()` | Interactive preview |
| `.pdf` | `pagedown::chrome_print()` | Requires Chrome / Chromium |
| `.png` | `webshot2::webshot()` | 2x zoom, table element only |

> PDF and PNG rendering require `pagedown` and `webshot2` respectively.
> Install with `install.packages(c("pagedown", "webshot2"))`.

---

## Getting Help

- `?plot_forest`, `?plot_tableone`
- `vignette("assoc")`: association analysis producing forest plot inputs
- [GitHub Issues](https://github.com/evanbio/ukbflow/issues)