This vignette demonstrates NMF-RE (Non-negative
Matrix Factorization with Random Effects), a mixed-effects extension of
NMF implemented in the nmfkc package.
Standard NMF with covariates models the data as: \[Y \approx X \Theta A\]
where \(\Theta A\) captures systematic (fixed) effects of covariates \(A\) on latent scores. However, in many applications — such as longitudinal studies, panel data, or clustered observations — individuals exhibit unit-specific deviations that cannot be explained by covariates alone.
NMF-RE addresses this by adding random effects \(U\): \[Y = X(\Theta A + U) + \mathcal{E}\]
The random effects \(U\) follow a ridge-type prior controlled by variance \(\tau^2\), and the penalty \(\lambda = \sigma^2 / \tau^2\) determines the degree of shrinkage.
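To build intuition for how \(\lambda\) shrinks \(U\), here is a minimal base-R sketch of the standard ridge solution for one subject's random effect. The numbers are made up for illustration; the package's internal updates may differ.

```r
# Minimal sketch (base R, illustrative numbers): for a rank-1 basis x and one
# subject's residual r after the fixed part, the ridge-penalized estimate is
#   u = (x'x + lambda)^{-1} x' r
x <- c(0.23, 0.24, 0.26, 0.27)  # a hypothetical rank-1 basis
r <- c(1.0, -0.5, 0.8, 0.3)     # a hypothetical fixed-effect residual

ridge_u <- function(x, r, lambda) sum(x * r) / (sum(x^2) + lambda)

ridge_u(x, r, lambda = 0)    # no penalty: ordinary least squares
ridge_u(x, r, lambda = 1)    # moderate shrinkage toward zero
ridge_u(x, r, lambda = 100)  # strong shrinkage: u nearly vanishes
```

Small \(\tau^2\) (hence large \(\lambda\)) pulls every subject's deviation toward the population trend; large \(\tau^2\) lets the deviations stand.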
The df.rate argument controls the effective degrees of freedom consumed by \(U\), preventing overfitting.

We use the Orthodont dataset from the nlme
package: orthodontic distance measurements for 27 children at ages 8,
10, 12, and 14. The covariate of interest is sex (Male/Female).
library(nmfkc)
library(nlme)
#> Warning: package 'nlme' was built under R version 4.4.3
data(Orthodont)
head(Orthodont)
#> Grouped Data: distance ~ age | Subject
#> distance age Subject Sex
#> 1 26.0 8 M01 Male
#> 2 25.0 10 M01 Male
#> 3 29.0 12 M01 Male
#> 4 31.0 14 M01 Male
#> 5 21.5 8 M02 Male
#> 6 22.5 10 M02 Male

Each column of \(Y\) is a subject, and each row is a time point (age). Since there are 4 ages and 27 subjects, \(Y\) is \(4 \times 27\).
Y <- matrix(Orthodont$distance, nrow = 4, ncol = 27)
colnames(Y) <- paste0("S", 1:27)
rownames(Y) <- paste("Age", c(8, 10, 12, 14))
Y[, 1:6]
#> S1 S2 S3 S4 S5 S6
#> Age 8 26 21.5 23.0 25.5 20.0 24.5
#> Age 10 25 22.5 22.5 27.5 23.5 25.5
#> Age 12 29 23.0 24.0 26.5 22.5 27.0
#> Age 14 31 26.5 27.5 27.0 26.0 28.5

We create a \(2 \times 27\)
covariate matrix with an intercept row and a binary male
indicator.
male <- ifelse(Orthodont$Sex[seq(1, 108, 4)] == "Male", 1, 0)
A <- rbind(intercept = 1, male = male)
A[, 1:6]
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> intercept 1 1 1 1 1 1
#> male 1 1 1 1 1 1

Before fitting the model, we use nmfre.dfU.scan() to
scan different regularization levels. The dfU cap rate
limits how many effective degrees of freedom the random effects \(U\) can consume, preventing
overfitting.
The scan evaluates rates from 0.1 to 1.0 (where 1.0 means no cap). Key columns in the output:

- dfU.cap: the cap implied by the rate (rate times the number of subjects).
- dfU: the effective degrees of freedom actually consumed by \(U\).
- safeguard: whether dfU stayed below the cap (TRUE = safe).
- hit: whether the cap was reached (FALSE = unconstrained).

scan_result <- nmfre.dfU.scan(1:10/10, Y, A, rank = 1)
scan_result
#> rate dfU.cap dfU safeguard hit converged tau2 sigma2 ICC
#> 1 0.1 2.7 2.70 FALSE TRUE TRUE 0.443 1 0.0270
#> 2 0.2 5.4 5.40 FALSE TRUE TRUE 0.996 1 0.0588
#> 3 0.3 8.1 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 4 0.4 10.8 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 5 0.5 13.5 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 6 0.6 16.2 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 7 0.7 18.9 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 8 0.8 21.6 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 9 0.9 24.3 5.42 TRUE FALSE TRUE 1.000 1 0.0590
#> 10 1.0 27.0 5.42 TRUE FALSE TRUE 1.000 1 0.0590

A good df.rate satisfies two criteria; choose the smallest rate where both conditions hold. This balances model flexibility against overfitting.
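As a rough cross-check, the unconstrained dfU of about 5.42 can be reproduced by hand if we assume dfU follows the usual ridge effective-degrees-of-freedom formula (our assumption for intuition, not documented package internals):

```r
# Assumed formula: each subject contributes x'x / (x'x + lambda) effective
# degrees of freedom, where x is the rank-1 basis column.
x <- c(0.2308, 0.2409, 0.2566, 0.2717)  # basis X as printed later in this vignette
lambda <- 1                             # sigma2 / tau2 is ~1 in the unconstrained rows
df_per_subject <- sum(x^2) / (sum(x^2) + lambda)
round(27 * df_per_subject, 2)           # ~5.42, matching the dfU column
```

With 27 subjects, the cap at rate 0.2 is 0.2 × 27 = 5.4, which is why rates 0.1 and 0.2 bind while larger rates do not.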
Based on the scan, we select df.rate = 0.2 and fit the model with rank = 1 (a single latent trend). The prefix argument labels the basis.

res <- nmfre(Y, A, rank = 1, df.rate = 0.2, prefix = "Trend")

summary() displays variance components, fit statistics, and the coefficient table with significance stars.
summary(res)
#> NMF-RE: Y(4,27) = X(4,1) [C(1,2) A + U(1,27)]
#> Iterations: 2 (converged, epsilon = 1e-05)
#> R-squared: 0.5654 (XB+blup), 0.4167 (XB)
#>
#> Variance components:
#> sigma2 = 1 (residual)
#> tau2 = 0.9961 (random effect)
#> lambda = 1.004 (sigma2 / tau2)
#> ICC = 0.0588 (tau2*tr(X'X) / (tau2*tr(X'X) + sigma2*P))
#> dfU = 5.40 <= 5.40 (cap rate = 0.20)
#>
#> Coefficients:
#> Estimate Std. Error (Boot) z value Pr(>z)
#> intercept:Trend1 90.393 2.471 2.452 36.58 <2e-16 ***
#> male:Trend1 9.606 3.056 2.977 3.14 0.0008356 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Key parts of the output:

- Variance components: \(\sigma^2\) (residual), \(\tau^2\) (random effect), \(\lambda = \sigma^2 / \tau^2\), the ICC, and dfU against its cap.
- Coefficient table (\(\Theta\)): estimates with bootstrap standard errors, z values, and significance stars.
- R-squared: reported for the fixed part alone (XB) and for fixed plus random effects (XB+blup).
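The printed ICC can be verified by hand from the formula shown in the summary header, using the rounded values reported above:

```r
# ICC = tau2 * tr(X'X) / (tau2 * tr(X'X) + sigma2 * P), with P = 4 time points
x <- c(0.2308, 0.2409, 0.2566, 0.2717)  # basis X as printed later in this vignette
tau2 <- 0.9961; sigma2 <- 1; P <- 4
trXX <- sum(x^2)                        # tr(X'X) for a single basis column
round(tau2 * trXX / (tau2 * trXX + sigma2 * P), 4)  # 0.0588, as reported
```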
We plot the growth curves to see how the model captures both population-level trends (fixed effects) and individual deviations (random effects).
age <- c(8, 10, 12, 14)
plot(age, res$XB[, 1], type = "n", ylim = range(Y),
xlab = "Age (years)", ylab = "Distance (mm)",
main = "Orthodont: NMF-RE Growth Curves")
# Plot observed data points
for (j in 1:27) {
pch_j <- ifelse(male[j] == 1, 4, 1)
points(age, Y[, j], pch = pch_j, col = "gray60")
}
# Plot individual predictions (fixed + random effects)
for (j in 1:27) {
lines(age, res$XB.blup[, j], col = "steelblue", lty = 3, lwd = 0.8)
}
# Plot population-level fixed effects (two lines: male and female)
for (j in 1:27) {
lines(age, res$XB[, j], col = "red", lwd = 2)
}
legend("topleft",
legend = c("Fixed effect (male/female)", "Fixed + Random (BLUP)",
"Male (observed)", "Female (observed)"),
lwd = c(2, 1, NA, NA), lty = c(1, 3, NA, NA),
pch = c(NA, NA, 4, 1),
col = c("red", "steelblue", "gray60", "gray60"),
cex = 0.85)

The basis \(X\) represents the latent temporal pattern shared across all subjects.
cat("Basis X (temporal pattern):\n")
#> Basis X (temporal pattern):
print(round(res$X, 4))
#> Trend1
#> Age 8 0.2308
#> Age 10 0.2409
#> Age 12 0.2566
#> Age 14 0.2717

Since rank = 1, there is one basis vector. Its shape
shows how the latent factor manifests across the four ages. The column
is normalized to sum to 1.
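This normalization resolves the usual scale indeterminacy of NMF: multiplying a basis column by a constant and dividing the corresponding coefficient by the same constant leaves the fit unchanged. A small sketch with made-up numbers:

```r
x_raw <- c(2.3, 2.4, 2.6, 2.7)  # hypothetical unnormalized basis column
theta <- 9.04                   # hypothetical matching coefficient
s <- sum(x_raw)
x_norm <- x_raw / s             # column rescaled to sum to 1
theta_norm <- theta * s         # coefficient rescaled to compensate
all.equal(x_raw * theta, x_norm * theta_norm)  # TRUE: fitted values unchanged
```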
\(\Theta\) maps covariates to latent scores:
cat("Coefficient matrix (Theta):\n")
#> Coefficient matrix (Theta):
print(round(res$C, 4))
#> intercept male
#> Trend1 90.3934 9.6059

- intercept: baseline level of the latent trend for females.
- male: additional contribution for males. A significant positive value indicates that males have higher orthodontic distances on average.

\(U\) captures individual deviations: subjects with positive \(U\) values grow faster than the population average, while negative values indicate slower growth.
barplot(res$U[1, ], names.arg = colnames(Y),
las = 2, cex.names = 0.7,
col = ifelse(male == 1, "steelblue", "salmon"),
main = "Random Effects (U) by Subject",
ylab = "Random effect value")
legend("topright", fill = c("steelblue", "salmon"),
legend = c("Male", "Female"), cex = 0.85)

Check that the algorithm converged properly:
cat("Converged:", res$converged, "\n")
#> Converged: TRUE
cat("Iterations:", res$iter, "\n")
#> Iterations: 2
cat("Stop reason:", res$stop.reason, "\n")
#> Stop reason: epsilon

plot(res$objfunc.iter, type = "l",
xlab = "Iteration", ylab = "Objective function",
main = "Convergence Trace")

Compare the fitted values against the original data:
residuals <- Y - res$XB.blup
cat("Mean absolute residual (BLUP):", round(mean(abs(residuals)), 4), "\n")
#> Mean absolute residual (BLUP): 1.5118
cat("Mean absolute residual (fixed):", round(mean(abs(Y - res$XB)), 4), "\n")
#> Mean absolute residual (fixed): 1.7612
# Fitted vs Observed
plot(as.vector(Y), as.vector(res$XB.blup),
xlab = "Observed", ylab = "Fitted (BLUP)",
main = "Observed vs Fitted", pch = 16, col = "steelblue")
abline(0, 1, col = "red", lwd = 2)

To appreciate the value of random effects, compare NMF-RE with a standard NMF covariate model (no random effects).
# Standard NMF with covariates (no random effects)
res_fixed <- nmfkc(Y, A = A, rank = 1)
#> Y(4,27)~X(4,1)C(1,2)A(2,27)=XB(1,27)...0sec
cat("=== Standard NMF (fixed effects only) ===\n")
#> === Standard NMF (fixed effects only) ===
cat("R-squared:", round(1 - sum((Y - res_fixed$XB)^2) / sum((Y - mean(Y))^2), 4), "\n\n")
#> R-squared: 0.4161
cat("=== NMF-RE (fixed + random effects) ===\n")
#> === NMF-RE (fixed + random effects) ===
cat("R-squared (XB): ", round(res$r.squared.fixed, 4), "\n")
#> R-squared (XB): 0.4167
cat("R-squared (XB+blup): ", round(res$r.squared, 4), "\n")
#> R-squared (XB+blup): 0.5654
cat("ICC: ", round(res$ICC, 4), "\n")
#> ICC: 0.0588

The improvement from fixed-only \(R^2\) to BLUP \(R^2\) quantifies the contribution of individual random effects.
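As a classical point of comparison (an aside, not part of the nmfkc workflow), the same idea of a population trend plus subject-level deviations can be fit to Orthodont with nlme::lme, which is already loaded:

```r
library(nlme)
data(Orthodont)
# Random intercept per subject plays the role of U; the fixed effects of age
# and sex play the role of Theta * A.
fit_lme <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = Orthodont)
fixef(fit_lme)    # population-level (fixed) effects
VarCorr(fit_lme)  # between-subject vs residual variance (cf. tau2 and sigma2)
```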
When fitting NMF-RE, you can separate optimization from statistical inference using nmfre.inference(). First fit with wild.bootstrap = FALSE (faster), then run inference separately:
# Fit without bootstrap (fast)
res_fast <- nmfre(Y, A, rank = 1, df.rate = 0.2, prefix = "Trend",
wild.bootstrap = FALSE)
# Run inference separately
res_inf <- nmfre.inference(res_fast, Y, A, wild.B = 500)
res_inf$coefficients[, c("Basis", "Covariate", "Estimate", "SE", "p_value")]
#> Basis Covariate Estimate SE p_value
#> 1 Trend1 intercept 90.393445 2.471312 3.306544e-293
#> 2 Trend1 male 9.605879 3.056105 8.356231e-04

NMF-RE provides a principled way to model individual heterogeneity in NMF:
| Step | Function | Purpose |
|------|----------|---------|
| 1 | nmfre.dfU.scan() | Select regularization level |
| 2 | nmfre() | Fit the mixed-effects model |
| 3 | summary() | Examine variance components and inference |
| 4 | nmfre.inference() | Separated inference (optional) |
| 5 | nmfkc.DOT() | Visualize with significance stars |
When to use NMF-RE: non-negative longitudinal, panel, or clustered data in which units show individual deviations that covariates alone cannot explain.

For more details on the underlying methodology, see the nmfkc package documentation.