--- title: "PMM3: Linear Regression for Symmetric Platykurtic Errors" author: "Serhii Zabolotnii" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{PMM3: Linear Regression for Symmetric Platykurtic Errors} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5, warning = FALSE, message = FALSE ) ``` ## The Problem: When OLS and PMM2 Are Not Enough Ordinary Least Squares (OLS) is optimal for Gaussian errors. **PMM2** improves upon OLS when errors are **asymmetric** (skewed). But what about errors that are **symmetric yet non-Gaussian**? Many real-world processes produce errors with **lighter tails** than the normal distribution (**platykurtic** errors, $\gamma_4 < 0$). In these cases: - OLS is suboptimal — it does not exploit the known platykurtic structure - PMM2 provides no improvement — it relies on the third moment (skewness), which is zero for symmetric distributions ### Where Platykurtic Errors Arise - **Bounded measurement errors**: sensors with a fixed operating range - **Rounding errors**: uniformly distributed within the rounding interval - **Truncated normal processes**: censored at fixed bounds - **Beta-distributed disturbances**: bounded economic indicators, proportions - **Quality control**: manufacturing tolerances with hard limits --- ## The Solution: PMM3 (Polynomial Maximization Method, S=3) **PMM3** constructs a cubic stochastic polynomial using the fourth and sixth central moments. The key theoretical quantities: - **Excess kurtosis**: $\gamma_4 = m_4 / m_2^2 - 3$ - **Sixth-order coefficient**: $\gamma_6 = m_6 / m_2^3 - 15 m_4/m_2^2 + 30$ - **Efficiency factor**: $g_3 = 1 - \gamma_4^2 / (6 + 9\gamma_4 + \gamma_6)$ The factor $g_3$ is the ratio $\text{Var}(\hat{\beta}_{\text{PMM3}}) / \text{Var}(\hat{\beta}_{\text{OLS}})$. When $g_3 < 1$, PMM3 achieves lower variance than OLS. 
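These formulas can be sanity-checked directly in base R. The helper `g3_factor()` below is a local illustration (not an EstemPMM function), applied to the exact central moments of the standard normal ($m_2 = 1$, $m_4 = 3$, $m_6 = 15$) and of the unit-variance uniform ($m_2 = 1$, $m_4 = 9/5$, $m_6 = 27/7$):

```{r g3_formula_check}
# Efficiency factor from the formulas above (local helper, not a package API)
g3_factor <- function(m2, m4, m6) {
  gamma4 <- m4 / m2^2 - 3
  gamma6 <- m6 / m2^3 - 15 * m4 / m2^2 + 30
  1 - gamma4^2 / (6 + 9 * gamma4 + gamma6)
}

# Gaussian moments: gamma4 = 0, so g3 = 1 (no possible gain over OLS)
g3_gauss <- g3_factor(1, 3, 15)

# Uniform moments: strongly platykurtic, so g3 drops below 1
g3_unif <- g3_factor(1, 9/5, 27/7)

c(gaussian = g3_gauss, uniform = g3_unif)
```

For Gaussian moments the factor is exactly 1, while for uniform moments it falls well below 1. These are the asymptotic values implied by the formulas; finite-sample estimates, such as the Monte Carlo figures reported later in this vignette, need not match them exactly.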
### The Newton-Raphson Algorithm PMM3 iteratively refines OLS estimates using the score equation: $$ Z_{1\nu} = \varepsilon_\nu (\kappa - \varepsilon_\nu^2), \quad \kappa = m_4 / m_2 $$ with Jacobian $J = X^\top \text{diag}(3\varepsilon^2 - \kappa) X$. The solver includes step-size limiting and a divergence guard for numerical stability. **Reference:** Based on the theoretical framework of the Polynomial Maximization Method (Zabolotnii et al., 2018). --- ## Getting Started ```{r load_package} library(EstemPMM) set.seed(42) ``` --- ## Example 1: Simple Regression with Uniform Errors Uniform errors are the canonical platykurtic case: $\gamma_4 \approx -1.2$, symmetric, bounded. ### Data Generation ```{r simulate_uniform} n <- 300 # Generate predictor x <- rnorm(n, mean = 5, sd = 2) # True parameters beta_0 <- 10 beta_1 <- 2.5 # Generate uniform errors (platykurtic, gamma4 ≈ -1.2) errors <- runif(n, -sqrt(3), sqrt(3)) # variance = 1 # Response variable y <- beta_0 + beta_1 * x + errors dat <- data.frame(y = y, x = x) ``` ### Visualize Error Distribution ```{r plot_errors_uniform, fig.width=7, fig.height=4} fit_ols_temp <- lm(y ~ x, data = dat) res_ols <- residuals(fit_ols_temp) par(mfrow = c(1, 2)) hist(res_ols, breaks = 30, main = "Residual Distribution (Uniform Errors)", xlab = "Residuals", col = "lightblue", border = "white") qqnorm(res_ols, main = "Q-Q Plot") qqline(res_ols, col = "red", lwd = 2) par(mfrow = c(1, 1)) cat("Skewness (gamma3):", round(pmm_skewness(res_ols), 3), "\n") cat("Excess kurtosis (gamma4):", round(pmm_kurtosis(res_ols) - 3, 3), "\n") ``` The Q-Q plot shows **lighter tails** than normal (S-shaped curve with compressed ends). The skewness is near zero but kurtosis is clearly negative. 
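For readers who want to see the mechanics, the score and Jacobian given earlier can be written out as a bare-bones Newton-Raphson loop in base R. This is a sketch only: it regenerates a small data set locally, holds $\kappa$ fixed at its OLS-residual value, and omits the step-size limiting and divergence guard of the packaged solver. All names here are local, not EstemPMM functions.

```{r nr_step_sketch}
# Bare-bones sketch of the PMM3 Newton-Raphson iteration (illustration only)
set.seed(1)
n <- 500
X <- cbind(1, rnorm(n, mean = 5, sd = 2))           # design matrix with intercept
beta_true <- c(10, 2.5)
y <- X %*% beta_true + runif(n, -sqrt(3), sqrt(3))  # platykurtic (uniform) errors

beta  <- solve(crossprod(X), crossprod(X, y))       # OLS starting values
res0  <- y - X %*% beta
kappa <- mean(res0^4) / mean(res0^2)                # kappa = m4 / m2, held fixed

for (i in 1:10) {
  eps   <- as.numeric(y - X %*% beta)
  z     <- eps * (kappa - eps^2)                    # score Z = eps * (kappa - eps^2)
  score <- crossprod(X, z)                          # X' Z
  J     <- crossprod(X, X * (3 * eps^2 - kappa))    # J = X' diag(3 eps^2 - kappa) X
  beta  <- beta - solve(J, score)                   # Newton step
}
as.numeric(beta)                                    # close to beta_true
```

Starting from the OLS solution, the steps are small and the loop settles quickly; the packaged solver adds the safeguards needed for less well-behaved data.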
### Automatic Method Selection ```{r dispatch_uniform} recommendation <- pmm_dispatch(res_ols) ``` ### Fit PMM3 ```{r fit_pmm3_simple} fit_pmm3 <- lm_pmm3(y ~ x, data = dat) fit_ols <- lm(y ~ x, data = dat) # Display PMM3 summary summary(fit_pmm3) ``` ### Compare with OLS ```{r compare_simple} comparison <- data.frame( Parameter = c("Intercept", "Slope"), True = c(beta_0, beta_1), OLS = coef(fit_ols), PMM3 = coef(fit_pmm3), Diff = coef(fit_pmm3) - coef(fit_ols) ) print(comparison, row.names = FALSE) cat("\nResidual sum of squares:\n") cat(" OLS: ", sum(residuals(fit_ols)^2), "\n") cat(" PMM3: ", sum(residuals(fit_pmm3)^2), "\n") cat("\nTheoretical variance ratio g3 =", fit_pmm3@g_coefficient, "\n") cat("Expected variance reduction:", round((1 - fit_pmm3@g_coefficient) * 100, 1), "%\n") ``` --- ## Example 2: Three-Way Comparison (OLS vs PMM2 vs PMM3) For symmetric platykurtic errors, PMM3 should outperform both OLS and PMM2. PMM2 should offer no improvement since $\gamma_3 \approx 0$. ```{r three_way} # Fit all three methods fit_ols <- lm(y ~ x, data = dat) fit_pmm2 <- lm_pmm2(y ~ x, data = dat) fit_pmm3 <- lm_pmm3(y ~ x, data = dat) comparison3 <- data.frame( Method = c("OLS", "PMM2", "PMM3"), Intercept = c(coef(fit_ols)[1], coef(fit_pmm2)[1], coef(fit_pmm3)[1]), Slope = c(coef(fit_ols)[2], coef(fit_pmm2)[2], coef(fit_pmm3)[2]) ) print(comparison3, row.names = FALSE, digits = 5) cat("\nTrue values: Intercept =", beta_0, ", Slope =", beta_1, "\n") cat("\nEfficiency factors:\n") vf2 <- pmm2_variance_factor(fit_pmm2@m2, fit_pmm2@m3, fit_pmm2@m4) cat(" PMM2 g2 =", vf2$g2, " (improvement:", round((1 - vf2$g2) * 100, 1), "%)\n") cat(" PMM3 g3 =", fit_pmm3@g_coefficient, " (improvement:", round((1 - fit_pmm3@g_coefficient) * 100, 1), "%)\n") ``` With symmetric platykurtic errors, PMM2's $g_2 \approx 1$ (no gain), while PMM3's $g_3 \ll 1$ (substantial gain). 
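The reason PMM2 is inert here can be seen from sample moments alone. Assuming the PMM2 reduction factor has the standard form $g_2 = 1 - \gamma_3^2 / (2 + \gamma_4)$ reported in the PMM2 literature (in practice, prefer the package's `pmm2_variance_factor()`), a symmetric sample gives $\gamma_3 \approx 0$ and hence $g_2 \approx 1$, while the fourth-moment structure still pushes $g_3$ below 1. A self-contained sketch:

```{r why_pmm2_inert}
# Symmetric platykurtic sample: gamma3 ~ 0 disables PMM2, gamma4 < 0 enables PMM3
set.seed(123)
e <- runif(5000, -sqrt(3), sqrt(3))
e <- e - mean(e)                      # central moments

m2 <- mean(e^2); m3 <- mean(e^3); m4 <- mean(e^4); m6 <- mean(e^6)

gamma3 <- m3 / m2^1.5
gamma4 <- m4 / m2^2 - 3
gamma6 <- m6 / m2^3 - 15 * m4 / m2^2 + 30

g2 <- 1 - gamma3^2 / (2 + gamma4)               # ~1: no PMM2 gain
g3 <- 1 - gamma4^2 / (6 + 9 * gamma4 + gamma6)  # well below 1: PMM3 gain

c(gamma3 = gamma3, g2 = g2, g3 = g3)
```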
--- ## Example 3: Real Data — Auto MPG (Quadratic Regression) The `auto_mpg` dataset contains fuel consumption data for 398 vehicles. The regression of **MPG vs Horsepower** (quadratic model) produces residuals with $|\gamma_3| \approx 0.2$ and $\gamma_4 \approx -1.3$ — a case where PMM3 is appropriate. ```{r auto_mpg_load} data(auto_mpg) # Remove missing horsepower values auto_complete <- na.omit(auto_mpg[, c("mpg", "horsepower")]) # Visualize the relationship (clearly nonlinear) plot(auto_complete$horsepower, auto_complete$mpg, main = "MPG vs Horsepower", xlab = "Horsepower", ylab = "Miles per Gallon", col = "steelblue", pch = 16, cex = 0.8) ``` ### Fit Quadratic Model ```{r auto_mpg_fit} # Quadratic OLS fit_auto_ols <- lm(mpg ~ horsepower + I(horsepower^2), data = auto_complete) # Check residual properties res_auto <- residuals(fit_auto_ols) cat("Residual diagnostics:\n") cat(" Skewness (gamma3):", round(pmm_skewness(res_auto), 3), "\n") cat(" Kurtosis (gamma4):", round(pmm_kurtosis(res_auto) - 3, 3), "\n") sym_test <- test_symmetry(res_auto) cat(" Symmetric:", sym_test$is_symmetric, "\n") # Dispatch recommendation dispatch_auto <- pmm_dispatch(res_auto) ``` ### PMM3 Estimation ```{r auto_mpg_pmm3} # Fit PMM3 (quadratic model) fit_auto_pmm3 <- lm_pmm3(mpg ~ horsepower + I(horsepower^2), data = auto_complete) # Fit PMM2 for comparison fit_auto_pmm2 <- lm_pmm2(mpg ~ horsepower + I(horsepower^2), data = auto_complete, na.action = na.omit) # Compare all three methods comparison_auto <- data.frame( Method = c("OLS", "PMM2", "PMM3"), Intercept = c(coef(fit_auto_ols)[1], coef(fit_auto_pmm2)[1], coef(fit_auto_pmm3)[1]), HP = c(coef(fit_auto_ols)[2], coef(fit_auto_pmm2)[2], coef(fit_auto_pmm3)[2]), HP_sq = c(coef(fit_auto_ols)[3], coef(fit_auto_pmm2)[3], coef(fit_auto_pmm3)[3]) ) print(comparison_auto, row.names = FALSE, digits = 6) cat("\nPMM3 g3 =", fit_auto_pmm3@g_coefficient, " (improvement:", round((1 - fit_auto_pmm3@g_coefficient) * 100, 1), "%)\n") ``` ### 
Visualize the Fits

```{r auto_mpg_plot, fig.width=7, fig.height=5}
hp_seq <- seq(min(auto_complete$horsepower), max(auto_complete$horsepower),
              length.out = 200)
newdata <- data.frame(horsepower = hp_seq)

pred_ols <- predict(fit_auto_ols, newdata = newdata)

# check.names = FALSE preserves the literal column name `I(horsepower^2)`;
# data.frame() would otherwise mangle it to I.horsepower.2.
pred_pmm3 <- predict(fit_auto_pmm3,
                     newdata = data.frame(horsepower = hp_seq,
                                          `I(horsepower^2)` = hp_seq^2,
                                          check.names = FALSE))

plot(auto_complete$horsepower, auto_complete$mpg,
     main = "MPG vs Horsepower: OLS and PMM3 Quadratic Fits",
     xlab = "Horsepower", ylab = "Miles per Gallon",
     col = "gray70", pch = 16, cex = 0.7)
lines(hp_seq, pred_ols, col = "blue", lwd = 2)
lines(hp_seq, as.numeric(pred_pmm3), col = "red", lwd = 2, lty = 2)
legend("topright", legend = c("OLS", "PMM3"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```

---

## Example 4: Real Data — Auto MPG (Linear, PMM2 Case)

For contrast, the regression of **MPG vs Acceleration** produces **asymmetric** residuals ($\gamma_3 \approx 0.5$), where PMM2 is appropriate instead.

```{r auto_mpg_acceleration}
# Linear model: MPG vs Acceleration
fit_accel_ols <- lm(mpg ~ acceleration, data = auto_mpg)
res_accel <- residuals(fit_accel_ols)

cat("Acceleration residual diagnostics:\n")
cat("  Skewness (gamma3):", round(pmm_skewness(res_accel), 3), "\n")
cat("  Kurtosis (gamma4):", round(pmm_kurtosis(res_accel) - 3, 3), "\n")

# Dispatch
dispatch_accel <- pmm_dispatch(res_accel)

# Compare
fit_accel_pmm2 <- lm_pmm2(mpg ~ acceleration, data = auto_mpg, na.action = na.omit)
vf2_accel <- pmm2_variance_factor(fit_accel_pmm2@m2, fit_accel_pmm2@m3, fit_accel_pmm2@m4)
cat("\nPMM2 g2 =", vf2_accel$g2, " (improvement:",
    round((1 - vf2_accel$g2) * 100, 1), "%)\n")
```

This demonstrates how `pmm_dispatch()` guides users to the correct method: **PMM3 for symmetric platykurtic** residuals, **PMM2 for asymmetric** residuals.

---

## Example 5: Multiple Regression

PMM3 extends naturally to multiple predictors.
```{r multiple_regression}
n <- 300
x1 <- rnorm(n)
x2 <- runif(n, -2, 2)
x3 <- rnorm(n, 3, 1)

# True parameters
beta <- c(5, 1.5, -0.8, 2.0)

# Beta(3, 3) symmetric errors (platykurtic, gamma4 = -2/3 ≈ -0.67)
errors_beta <- (rbeta(n, 3, 3) - 0.5) * sqrt(12 * 3)

y_multi <- beta[1] + beta[2]*x1 + beta[3]*x2 + beta[4]*x3 + errors_beta
dat_multi <- data.frame(y = y_multi, x1 = x1, x2 = x2, x3 = x3)

# Fit
fit_multi_ols <- lm(y ~ x1 + x2 + x3, data = dat_multi)
fit_multi_pmm3 <- lm_pmm3(y ~ x1 + x2 + x3, data = dat_multi)

comparison_multi <- data.frame(
  Parameter = c("Intercept", "x1", "x2", "x3"),
  True = beta,
  OLS = coef(fit_multi_ols),
  PMM3 = coef(fit_multi_pmm3),
  Diff = coef(fit_multi_pmm3) - coef(fit_multi_ols)
)
print(comparison_multi, row.names = FALSE, digits = 4)

cat("\ng3 =", fit_multi_pmm3@g_coefficient, " (improvement:",
    round((1 - fit_multi_pmm3@g_coefficient) * 100, 1), "%)\n")
```

---

## Example 6: No-Intercept Model

PMM3 supports models without an intercept term.

```{r no_intercept}
# Model through the origin
y_noint <- 3 * x1 + errors_beta
dat_noint <- data.frame(y = y_noint, x1 = x1)

fit_noint <- lm_pmm3(y ~ x1 - 1, data = dat_noint)
cat("True slope: 3\n")
cat("PMM3 estimate:", coef(fit_noint), "\n")
cat("Converged:", fit_noint@convergence, "\n")
```

---

## Interpreting the PMM3 Summary

The `summary()` method reports several PMM3-specific quantities:

```{r interpret_summary}
summary(fit_pmm3)
```

| Quantity | Meaning |
|----------|---------|
| **m2, m4, m6** | Central moments of the initial (OLS) residuals |
| **gamma4** | Excess kurtosis: negative = platykurtic, zero = Gaussian |
| **gamma6** | Sixth-order cumulant coefficient |
| **g3** | Theoretical efficiency factor: $\text{Var}(\hat\beta_{\text{PMM3}}) / \text{Var}(\hat\beta_{\text{OLS}})$ |
| **kappa** | Moment ratio $m_4/m_2$ used in the NR score equations |
| **Convergence** | Whether the Newton-Raphson algorithm converged |
| **Iterations** | Number of NR iterations performed |

**Rule of thumb:** If $g_3
< 0.90$, PMM3 provides a meaningful improvement over OLS ($> 10\%$ variance reduction). --- ## Adaptive Mode By default, PMM3 computes $\kappa$ from OLS residuals and holds it fixed throughout the Newton-Raphson iterations. In **adaptive mode**, $\kappa$ is re-estimated from the current residuals at each iteration: ```{r adaptive_mode} fit_fixed <- lm_pmm3(y ~ x, data = dat, adaptive = FALSE) fit_adaptive <- lm_pmm3(y ~ x, data = dat, adaptive = TRUE) comparison_adapt <- data.frame( Mode = c("Fixed kappa", "Adaptive kappa"), Intercept = c(coef(fit_fixed)[1], coef(fit_adaptive)[1]), Slope = c(coef(fit_fixed)[2], coef(fit_adaptive)[2]), Iterations = c(fit_fixed@iterations, fit_adaptive@iterations) ) print(comparison_adapt, row.names = FALSE, digits = 5) ``` Adaptive mode can improve estimates when the error distribution departs significantly from what the initial OLS residuals suggest, at the cost of additional iterations. --- ## Diagnostic Plots PMM3 provides four diagnostic plots: ```{r diagnostics_full, fig.height=6} plot(fit_pmm3) ``` 1. **Residuals vs Fitted** — check for non-linearity and heteroscedasticity 2. **Q-Q Plot** — platykurtic residuals show lighter tails (compressed ends) 3. **Scale-Location** — check homoscedasticity 4. 
**Residual Histogram** — platykurtic distributions appear flatter than a bell curve, often with "chopped" tails --- ## Utility Functions ### Automatic Method Selection ```{r dispatch_utility} # Symmetric platykurtic errors → PMM3 pmm_dispatch(runif(500, -1, 1)) # Asymmetric errors → PMM2 pmm_dispatch(rexp(500) - 1) # Gaussian errors → OLS pmm_dispatch(rnorm(500)) ``` ### Testing Symmetry ```{r symmetry_test} test_symmetry(residuals(fit_ols)) ``` ### Computing Moments ```{r moments_utility} mom <- compute_moments_pmm3(residuals(fit_ols)) cat("m2 =", mom$m2, "\n") cat("m4 =", mom$m4, "\n") cat("m6 =", mom$m6, "\n") cat("gamma4 =", mom$gamma4, "\n") cat("kappa =", mom$kappa, "\n") cat("g3 =", mom$g3, "\n") ``` ### Sixth-Order Cumulant ```{r gamma6_utility} pmm_gamma6(residuals(fit_ols)) ``` ### Variance Factor ```{r variance_factor} pmm3_variance_factor(mom$m2, mom$m4, mom$m6) ``` --- ## When to Use PMM3 **Use PMM3 when:** - Residuals are **symmetric** ($|\gamma_3| \leq 0.3$) - Residuals are **platykurtic** ($\gamma_4 < -0.7$) - Error distributions are bounded (uniform, beta-symmetric, truncated normal) - Sample size is **moderate to large** ($n \geq 100$, ideally $n \geq 200$) **Use PMM2 instead when:** - Residuals are **skewed** ($|\gamma_3| > 0.3$) **Stick with OLS when:** - Residuals are approximately **Gaussian** ($|\gamma_4| < 0.7$, $|\gamma_3| < 0.3$) - Very small samples ($n < 50$) ### Decision Workflow 1. **Fit OLS** as baseline 2. **Check residuals**: `pmm_skewness()`, `pmm_kurtosis()`, `test_symmetry()` 3. **Use `pmm_dispatch()`** for automatic recommendation 4. **Refit** with the recommended method 5. 
**Compare** coefficients and efficiency factors --- ## Efficiency Summary (Monte Carlo Evidence) Based on Monte Carlo simulations (R = 1000 replications): | Error Distribution | $\gamma_4$ | $g_3$ | Variance Reduction | |-------------------------|------------|--------|-------------------| | Uniform | $-1.20$ | $0.64$ | 36% | | Beta(2,2) symmetric | $-0.86$ | $0.78$ | 22% | | Beta(3,3) symmetric | $-0.57$ | $0.88$ | 12% | | Truncated Normal ($|x| \leq 2$) | $-0.47$ | $0.91$ | 9% | | Gaussian (control) | $0.00$ | $1.00$ | 0% | **Key insight:** The more platykurtic the errors ($\gamma_4$ further from 0), the larger the efficiency gain from PMM3. --- ## Next Steps - **PMM3 Time Series**: See `vignette("pmm3_time_series")` for AR, MA, ARMA, ARIMA models with PMM3 - **PMM2 Introduction**: See `vignette("pmm2_introduction")` for asymmetric error estimation - **PMM2 Time Series**: See `vignette("pmm2_time_series")` for PMM2 time series models - **Bootstrap Inference**: See `vignette("bootstrap_inference")` --- ## References 1. Zabolotnii S., Warsza Z.L., Tkachenko O. (2018). Polynomial Estimation of Linear Regression Parameters for the Asymmetric PDF of Errors. Springer AISC, vol 743. doi:10.1007/978-3-319-77179-3_75 2. Package functions: `?lm_pmm3`, `?pmm_dispatch`, `?compute_moments_pmm3`, `?test_symmetry`, `?pmm_gamma6` 3. GitHub: https://github.com/SZabolotnii/EstemPMM