Estimates the NMF-RE model $$Y = X(\Theta A + U) + \mathcal{E}$$ where \(Y\) (\(P \times N\)) is a non-negative observation matrix, \(X\) (\(P \times Q\)) is a non-negative basis matrix learned from the data, \(\Theta\) (\(Q \times K\)) is a non-negative coefficient matrix capturing systematic covariate effects on latent scores, \(A\) (\(K \times N\)) is a covariate matrix, and \(U\) (\(Q \times N\)) is a random effects matrix capturing unit-specific deviations in the latent score space.
NMF-RE can be viewed as a mixed-effects latent-variable model defined on a reconstruction (mean) structure. The non-negativity constraint on \(X\) induces sparse, parts-based loadings, achieving measurement-side variable selection without an explicit sparsity penalty. Inference on \(\Theta\) provides covariate-side variable selection by identifying which covariates significantly affect which components.
Estimation alternates ridge-type BLUP-like closed-form updates for \(U\) with multiplicative non-negative updates for \(X\) and \(\Theta\). The effective degrees of freedom consumed by \(U\) are monitored and a df-based cap can be enforced to prevent near-saturated fits.
When wild.bootstrap = TRUE, inference on \(\Theta\) is performed
conditional on \((\hat{X}, \hat{U})\) via asymptotic linearization,
a one-step Newton update, and a multiplier (wild) bootstrap, yielding
standard errors, z-values, p-values, and confidence intervals without
repeated constrained re-optimization.
Usage
nmfre(
Y,
A = NULL,
rank = 2,
df.rate = NULL,
wild.bootstrap = TRUE,
epsilon = 1e-05,
maxit = 50000,
...
)Arguments
- Y
Observation matrix (P x N), non-negative.
- A
Covariate matrix (K x N). Default is a row of ones (intercept only).
- rank
Integer. Rank of the basis matrix \(X\). Default is 2. For backward compatibility,
Qis accepted via....- df.rate
Rate for computing the dfU cap (
cap = rate * N * Q). For backward compatibility,dfU.cap.rateis accepted via.... IfNULL(default), runsnmfre.dfU.scaninternally and selects the minimum rate where the cap is not binding. Usenmfre.dfU.scanbeforehand to examine dfU behavior across rates and choose an appropriate value.- wild.bootstrap
Logical. If
TRUE(default), perform wild bootstrap inference on \(\Theta\).- epsilon
Convergence tolerance for relative change in objective (default 1e-5).
- maxit
Maximum number of iterations (default 50000).
- ...
Additional arguments for initialization, variance control, dfU control, optimization, and inference settings.
X.init: Initial basis matrix (P x Q), orNULL. WhenNULL,nmfkcis called internally to generate initial values.C.init: Initial coefficient matrix (Q x K), orNULL. WhenNULL,nmfkcis called internally to generate initial values.U.init: Initial random effects matrix (Q x N), orNULL(all zeros).prefix: Prefix for basis names (default"Basis").sigma2: Initial residual variance (default 1).sigma2.update: Logical. Update \(\sigma^2\) during iterations (defaultTRUE).tau2: Initial random effect variance (default 1).tau2.update: Logical. Update \(\tau^2\) by moment matching (defaultTRUE). Disabled when dfU cap is active.dfU.control: Either"cap"(default) to enforce a cap on dfU, or"off"for no cap.print.trace: Logical. IfTRUE, print progress every 100 iterations (defaultFALSE).seed: Integer seed for reproducibility (default 1).C.p.side: P-value sidedness:"one.sided"(default, for boundary null H0: C=0 vs H1: C>0) or"two.sided".wild.B: Number of wild bootstrap replicates (default 500).wild.seed: Seed for wild bootstrap (default 123).
Value
A list of class "nmfre" with components.
The model is \(Y = X(\Theta A + U) + \mathcal{E}\).
Core matrices
XBasis matrix \(X\) (\(P \times Q\)), columns normalized to sum to 1.
CCoefficient matrix \(\Theta\) (\(Q \times K\)).
URandom effects matrix \(U\) (\(Q \times N\)).
Variance components
sigma2Residual variance \(\hat{\sigma}^2\).
tau2Random effect variance \(\hat{\tau}^2\).
lambdaRidge penalty \(\lambda = \sigma^2 / \tau^2\).
Convergence diagnostics
convergedLogical. Whether the algorithm converged.
stop.reasonCharacter string describing why iteration stopped.
iterNumber of iterations performed.
maxitMaximum iterations setting used.
epsilonConvergence tolerance used.
objfuncFinal objective function value \(\|Y - X(\Theta A + U)\|^2 + \lambda \|U\|^2\).
rel.change.finalFinal relative change in objective.
objfunc.iterNumeric vector of objective values per iteration.
rss.traceNumeric vector of \(\|Y - X(\Theta A + U)\|^2\) per iteration.
Effective degrees of freedom (dfU) diagnostics
dfUFinal effective degrees of freedom \(\mathrm{df}_U = N \sum_q d_q / (d_q + \lambda)\), where \(d_q\) are eigenvalues of \(X'X\).
dfU.capUpper bound imposed on \(\mathrm{df}_U\).
dfU.cap.rateRate used to compute the cap.
dfU.cap.scanResult of
nmfre.dfU.scan, orNULL.lambda.enforcedFinal \(\lambda\) enforced to satisfy the cap.
dfU.hit.capLogical. Whether the cap was binding.
dfU.hit.iterIteration at which the cap first bound.
dfU.frac\(\mathrm{df}_U / (NQ)\), fraction of maximum df.
dfU.cap.frac\(\mathrm{df}_U^{\mathrm{cap}} / (NQ)\).
Fitted matrices
BFixed-effect scores \(\Theta A\) (\(Q \times N\)).
B.probColumn-normalized probabilities from \(\max(\Theta A, 0)\).
B.blupBLUP scores \(\Theta A + U\) (\(Q \times N\)).
B.blup.posNon-negative BLUP scores \(\max(\Theta A + U, 0)\) (\(Q \times N\)).
B.blup.probColumn-normalized probabilities from \(\max(\Theta A + U, 0)\).
XBFitted values from fixed effects \(X \Theta A\) (\(P \times N\)).
XB.blupFitted values including random effects \(X(\Theta A + U)\) (\(P \times N\)).
Fit statistics
r.squaredCoefficient of determination \(R^2\) for \(Y\) vs \(X(\Theta A + U)\) (BLUP).
r.squared.fixedCoefficient of determination \(R^2\) for \(Y\) vs \(X \Theta A\) (fixed effects only).
ICCTrace-based Intraclass Correlation Coefficient. In the NMF-RE model, the conditional covariance of the \(n\)-th observation column is \(\mathrm{Var}(Y_n) = \tau^2 X X^\top + \sigma^2 I_P\), a \(P \times P\) matrix. Unlike a standard random intercept model where the design matrix \(Z\) is a simple indicator (so the ICC reduces to \(\tau^2 / (\sigma^2 + \tau^2)\)), the basis matrix \(X\) plays the role of \(Z\) in a random slopes model, making the variance contribution of \(U\) depend on \(X\). To obtain a scalar summary, we take the trace of each component: $$\mathrm{ICC} = \frac{\tau^2 \, \mathrm{tr}(X^\top X)} {\tau^2 \, \mathrm{tr}(X^\top X) + \sigma^2 P}.$$ This equals the average (over \(P\) dimensions) proportion of per-column variance attributable to the random effects.
Inference on \(\Theta\) (wild bootstrap)
sigma2.used\(\hat{\sigma}^2\) used for inference.
C.vec.covVariance-covariance matrix for \(\mathrm{vec}(\Theta)\) (\(QK \times QK\)).
C.seStandard error matrix for \(\Theta\) (\(Q \times K\)).
C.se.hessSandwich (Hessian-based) SE matrix for \(\Theta\).
C.se.bootBootstrap SE matrix for \(\Theta\).
coefficientsData frame with columns Estimate, Std. Error, z value, Pr(>|z|), and confidence interval bounds for each element of \(\Theta\).
C.ci.lowerLower confidence interval matrix for \(\Theta\) (\(Q \times K\)).
C.ci.upperUpper confidence interval matrix for \(\Theta\) (\(Q \times K\)).
C.boot.sdBootstrap standard deviation matrix for \(\Theta\) (\(Q \times K\)).
C.p.sideP-value sidedness used:
"one.sided"or"two.sided".
References
Satoh, K. (2026). Wild Bootstrap Inference for Non-Negative Matrix Factorization with Random Effects. arXiv:2603.01468. https://arxiv.org/abs/2603.01468
Examples
# Example 1. cars data
Y <- matrix(cars$dist, nrow = 1)
A <- rbind(intercept = 1, speed = cars$speed)
res <- nmfre(Y, A, rank = 1, maxit = 5000)
summary(res)
#> NMF-RE: Y(1,50) = X(1,1) [C(1,2) A + U(1,50)]
#> Iterations: 4 (converged, epsilon = 1e-05)
#> R-squared: 0.9047 (XB+blup), 0.6005 (XB)
#>
#> Variance components:
#> sigma2 = 1 (residual)
#> tau2 = 1 (random effect)
#> lambda = 1 (sigma2 / tau2)
#> ICC = 0.5000 (tau2*tr(X'X) / (tau2*tr(X'X) + sigma2*P))
#> dfU = 25.00 <= 30.00 (cap rate = 0.60)
#>
#> Coefficients:
#> Estimate Std. Error (Boot) z value Pr(>z)
#> intercept:Basis1 0.014 6.427 3.254 0.00 0.4991
#> speed:Basis1 2.851 0.456 0.432 6.25 2.118e-10 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# \donttest{
# Example 2. Orthodont data (nlme)
if (requireNamespace("nlme", quietly = TRUE)) {
Y <- matrix(nlme::Orthodont$distance, 4, 27)
male <- ifelse(nlme::Orthodont$Sex[seq(1, 108, 4)] == "Male", 1, 0)
A <- rbind(intercept = 1, male = male)
# Scan dfU cap rates to choose an appropriate value
nmfre.dfU.scan(1:10/10, Y, A, rank = 1)
# Fit with chosen rate
res <- nmfre(Y, A, rank = 1, df.rate = 0.2)
summary(res)
}# }
#> NMF-RE: Y(4,27) = X(4,1) [C(1,2) A + U(1,27)]
#> Iterations: 2 (converged, epsilon = 1e-05)
#> R-squared: 0.5654 (XB+blup), 0.4167 (XB)
#>
#> Variance components:
#> sigma2 = 1 (residual)
#> tau2 = 0.9961 (random effect)
#> lambda = 1.004 (sigma2 / tau2)
#> ICC = 0.0588 (tau2*tr(X'X) / (tau2*tr(X'X) + sigma2*P))
#> dfU = 5.40 <= 5.40 (cap rate = 0.20)
#>
#> Coefficients:
#> Estimate Std. Error (Boot) z value Pr(>z)
#> intercept:Basis1 90.393 2.471 2.452 36.58 <2e-16 ***
#> male:Basis1 9.606 3.056 2.977 3.14 0.0008356 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1