---
title: "Getting Started with GWPR.light 1.0.0"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with GWPR.light 1.0.0}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Overview

`GWPR.light` 1.0.0 provides a modern, `sf`-first API for Geographically
Weighted Panel Regression (GWPR). The public interface consists of four
functions:

- `gwpr()`: full pipeline (bandwidth search + fitting + diagnostics)
- `select_bandwidth()`: standalone bandwidth optimisation
- `fit_gwpr()`: fit with a known bandwidth
- `diagnose_gwpr()`: diagnostic tests on a fitted model

All functions accept panel data as a plain `data.frame` and spatial
information as an `sf` object. The `workers` argument controls parallel
execution; the default `workers = 1` runs serially and is safe in all
environments.

## Minimal linear GWPR example

```{r minimal-linear}
library(GWPR.light)
library(sf)

set.seed(42)

# Simulate a tiny spatial panel: 6 units, 4 time periods
n_units <- 6
n_time  <- 4

pts <- sf::st_as_sf(
  data.frame(
    id = 1:n_units,
    X  = c(0, 1, 2, 0, 1, 2),
    Y  = c(0, 0, 0, 1, 1, 1)
  ),
  coords = c("X", "Y"),
  crs    = NA_integer_
)

dat <- data.frame(
  id   = rep(1:n_units, each = n_time),
  time = rep(1:n_time, n_units),
  x1   = rnorm(n_units * n_time),
  x2   = rnorm(n_units * n_time)
)
dat$y <- 1.5 * dat$x1 - 0.8 * dat$x2 + rnorm(n_units * n_time, sd = 0.3)

# Fit with a known bandwidth (skip automatic search for speed)
fit <- fit_gwpr(
  formula   = y ~ x1 + x2,
  data      = dat,
  spatial   = pts,
  id        = "id",
  time      = "time",
  bandwidth = 2,
  family    = "gaussian",
  model     = "pooling",
  workers   = 1
)

print(fit)
```

## Accessing results

```{r results}
# Overall goodness-of-fit metrics
str(fit$metrics)

# Per-unit spatial coefficients (one row per spatial unit)
if (!is.null(fit$spatial_results)) {
  head(fit$spatial_results)
}
```

## Bandwidth selection (grid search)

```{r bandwidth-search}
bw <- select_bandwidth(
  formula = y ~ x1 + x2,
  data    = dat,
  spatial = pts,
  id      = "id",
  time    = "time",
  family  = "gaussian",
  model   = "pooling",
  method  = "grid",
  control = list(lower = 0.5, upper = 3, step = 0.5),
  workers = 1
)

print(bw)
cat("Best bandwidth:", bw$best_bandwidth, "\n")
```

## Full pipeline with gwpr()

```{r full-pipeline}
# Use the best bandwidth found above to avoid re-running search
full_fit <- gwpr(
  formula     = y ~ x1 + x2,
  data        = dat,
  spatial     = pts,
  id          = "id",
  time        = "time",
  bandwidth   = bw$best_bandwidth,
  family      = "gaussian",
  model       = "pooling",
  diagnostics = FALSE,  # skip diagnostics for speed
  workers     = 1
)

print(full_fit)
```

## Diagnostics

```{r diagnostics}
diag_result <- diagnose_gwpr(
  full_fit,
  diagnostics = c("f_test", "hausman", "lm_test")
)

print(diag_result)
```

## Long-running examples

The following code illustrates automatic bandwidth search via SGD and a
binomial (logistic) GWPR. These are wrapped in `\donttest{}` in the function
documentation because they may take more than a few seconds on larger
datasets.

```{r long-examples, eval=FALSE}
# Automatic SGD bandwidth search + fit (may take several seconds)
fit_auto <- gwpr(
  formula           = y ~ x1 + x2,
  data              = dat,
  spatial           = pts,
  id                = "id",
  time              = "time",
  bandwidth_method  = "sgd",
  bandwidth_control = list(n_iter = 20, step_size = 0.1),
  workers           = 1,
  seed              = 123
)

# Binomial GWPR
dat$y_bin <- as.integer(dat$y > 0)

fit_logit <- fit_gwpr(
  formula   = y_bin ~ x1 + x2,
  data      = dat,
  spatial   = pts,
  id        = "id",
  time      = "time",
  bandwidth = 2,
  family    = "binomial",
  model     = "pooling",
  workers   = 1
)
```
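
Because every function takes the panel as a plain `data.frame` and the locations as a separate `sf` object, it can help to confirm that the two line up before fitting. The following is a minimal base-R sketch of such a check. `check_panel()` is a hypothetical helper written for this vignette, not part of `GWPR.light`, and whether the package itself requires a balanced panel is an assumption not confirmed here.

```r
# Hypothetical helper (NOT part of GWPR.light): sanity-check a panel
# against the ids of the spatial units before fitting.
check_panel <- function(data, spatial_ids, id = "id", time = "time") {
  ids <- unique(data[[id]])
  # Number of observed time periods for each unit
  obs_per_unit <- table(data[[id]])
  list(
    # TRUE when every unit has the same number of observations
    balanced        = length(unique(as.integer(obs_per_unit))) == 1L,
    # TRUE when some (id, time) pair occurs more than once
    duplicated_keys = anyDuplicated(data[, c(id, time)]) > 0L,
    # ids present in the panel but absent from the spatial object
    missing_in_sf   = setdiff(ids, spatial_ids),
    # ids present in the spatial object but absent from the panel
    missing_in_data = setdiff(spatial_ids, ids)
  )
}

# Same layout as the simulated panel above: 6 units, 4 time periods
panel <- data.frame(
  id   = rep(1:6, each = 4),
  time = rep(1:4, 6)
)

chk <- check_panel(panel, spatial_ids = 1:6)
str(chk)
```

With the objects from the minimal example, the equivalent call would be `check_panel(dat, spatial_ids = pts$id)`.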