nmfkc.cv performs k-fold cross-validation for the tri-factorization model
\(Y \approx X C A = X B\), where
\(Y(P,N)\) is the observation matrix,
\(A(R,N)\) is the covariate (or kernel) matrix,
\(X(P,Q)\) is the basis matrix (with \(Q \le \min(P,N)\)),
\(C(Q,R)\) is the parameter matrix, and
\(B(Q,N)\) is the coefficient matrix (\(B = C A\)).
Given \(Y\) (and optionally \(A\)), \(X\) and \(C\) are fitted on each training split and predictive performance is evaluated on the held-out split.
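As a quick orientation, the matrix dimensions can be sketched in base R. The sizes below are toy values chosen for illustration only (not package defaults):

```r
# Dimension sketch of Y ~ X C A = X B (toy sizes)
P <- 4; N <- 6; R <- 2; Q <- 2          # Q <= min(P, N)
Y <- matrix(runif(P * N), P, N)         # observations, P x N
A <- matrix(runif(R * N), R, N)         # covariates,   R x N
X <- matrix(runif(P * Q), P, Q)         # basis,        P x Q
C <- matrix(runif(Q * R), Q, R)         # parameters,   Q x R
B <- C %*% A                            # coefficients, Q x N
Yhat <- X %*% B                         # approximation of Y, P x N
dim(Yhat)                               # matches dim(Y)
```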
Arguments
- Y
Observation matrix.
- A
Covariate matrix. If NULL, the identity matrix is used.
- Q
Rank of the basis matrix \(X\); must satisfy \(Q \le \min(P,N)\).
- ...
Additional arguments controlling CV and the internal nmfkc call:
  - Y.weights
  Optional numeric matrix or vector; 0 indicates missing/ignored values.
  - div
  Number of folds (\(k\)); default: 5.
  - seed
  Integer seed for reproducible partitioning; default: 123.
  - shuffle
  Logical. If TRUE (default), samples are randomly shuffled before splitting (standard CV); if FALSE, the split is sequential (block CV; recommended for time series).
  - Arguments passed to nmfkc, e.g., gamma (B.L1), epsilon, maxit, method ("EU" or "KL"), X.restriction, X.init, etc.
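The shuffle switch only changes how the columns of \(Y\) are partitioned into folds. A minimal sketch of the two schemes in base R (assumed behavior for illustration; the package's internal partitioning may differ in detail):

```r
N <- 10; div <- 5
# shuffle = FALSE: contiguous blocks of columns (block CV)
block_folds <- rep(1:div, each = ceiling(N / div))[1:N]
# shuffle = TRUE: the same fold sizes, assigned in random order
set.seed(123)
shuffled_folds <- sample(block_folds)
block_folds  # 1 1 2 2 3 3 4 4 5 5
```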
Value
A list with components:
- objfunc
Mean loss per valid entry over all folds (MSE for method = "EU").
- sigma
Residual standard error (RMSE). Available only if method = "EU"; on the same scale as \(Y\).
- objfunc.block
Loss for each fold.
- block
Vector of fold indices (1, …, div) assigned to each column of \(Y\).
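Since objfunc is a mean squared error under method = "EU", sigma should correspond to its square root. This is a hedged reading of the documented MSE/RMSE relationship; the package may apply additional corrections:

```r
objfunc <- 286.3378     # example MSE from a cross-validation run
sigma <- sqrt(objfunc)  # RMSE, on the scale of Y
round(sigma, 2)         # 16.92
```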
Examples
# Example 1 (with explicit covariates):
Y <- matrix(cars$dist, nrow = 1)  # 1 x N observation matrix
A <- rbind(1, cars$speed)         # intercept row + covariate (2 x N)
res <- nmfkc.cv(Y, A, Q = 1)
res$objfunc
#> [1] 286.3378
# Example 2 (kernel A and beta sweep):
Y <- matrix(cars$dist, nrow = 1)
U <- matrix(c(5, 10, 15, 20, 25), nrow = 1)  # kernel anchor points
V <- matrix(cars$speed, nrow = 1)            # covariate values
betas <- 25:35 / 1000                        # candidate kernel parameters
obj <- numeric(length(betas))
for (i in seq_along(betas)) {
  A <- nmfkc.kernel(U, V, beta = betas[i])
  obj[i] <- nmfkc.cv(Y, A, Q = 1, div = 10)$objfunc  # 10-fold CV loss
}
betas[which.min(obj)]                        # beta with the smallest CV loss
betas[which.min(obj)]
#> [1] 0.031