nmfkc.cv performs k-fold cross-validation for the tri-factorization model
\(Y \approx X C A = X B\), where
\(Y(P,N)\) is the observation matrix,
\(A(R,N)\) is the covariate (or kernel) matrix,
\(X(P,Q)\) is the basis matrix (with \(Q \le \min(P,N)\)),
\(C(Q,R)\) is the parameter matrix, and
\(B(Q,N)\) is the coefficient matrix (\(B = C A\)).
Given \(Y\) (and optionally \(A\)), \(X\) and \(C\) are fitted on each training split and predictive performance is evaluated on the held-out split.
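As a quick sanity check on the dimensions above, the factors can be multiplied out in base R (an illustrative sketch with arbitrary sizes, not part of the package):

```r
# Illustrative dimension check for Y ~ X C A (arbitrary sizes, base R only)
P <- 4; N <- 6; Q <- 2; R <- 3
X <- matrix(runif(P * Q), P, Q)  # basis matrix, P x Q
C <- matrix(runif(Q * R), Q, R)  # parameter matrix, Q x R
A <- matrix(runif(R * N), R, N)  # covariate matrix, R x N
B <- C %*% A                     # coefficient matrix, Q x N
dim(X %*% B)                     # P x N, matching Y
```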
nmfkc.cv(Y, A = NULL, Q = 2, ...)

Y    Observation matrix.
A    Covariate matrix. If NULL, the identity matrix is used.
Q    Rank of the basis matrix \(X\); must satisfy \(Q \le \min(P,N)\).
...  Additional arguments controlling CV and the internal nmfkc call:
     Y.weights  Optional numeric matrix or vector; 0 indicates missing/ignored values.
     div        Number of folds (\(k\)); default: 5.
     seed       Integer seed for reproducible partitioning; default: 123.
     shuffle    Logical. If TRUE (default), samples are shuffled at random (standard CV);
                if FALSE, the split is sequential (block CV; recommended for time series).
     Arguments passed on to nmfkc, e.g., gamma (B.L1), epsilon, maxit,
     method ("EU" or "KL"), X.restriction, X.init, etc.
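The effect of shuffle can be pictured with a small base-R sketch of how fold labels might be assigned to the \(N\) columns of \(Y\). Note that make_folds is a hypothetical helper, not a package function, and the package's internal scheme may differ:

```r
# Hypothetical sketch of fold assignment; make_folds is NOT part of the package.
make_folds <- function(N, div = 5, seed = 123, shuffle = TRUE) {
  labels <- sort(rep_len(seq_len(div), N))  # contiguous blocks of near-equal size
  if (shuffle) {                            # standard CV: randomize the assignment
    set.seed(seed)
    labels <- sample(labels)
  }
  labels                                    # one fold label per column of Y
}
make_folds(6, div = 3, shuffle = FALSE)     # block CV: 1 1 2 2 3 3
```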
A list with components:

objfunc        Mean loss per valid entry over all folds (MSE for method = "EU").
sigma          Residual standard error (RMSE), on the same scale as Y;
               available only if method = "EU".
objfunc.block  Loss for each fold.
block          Vector of fold indices (1, ..., div) assigned to each column of \(Y\).
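Since objfunc is a per-entry MSE under method = "EU" and sigma is an RMSE, the two should be related by a square root (possibly up to a degrees-of-freedom adjustment); a small numeric illustration using the objfunc value reported in Example 1 below:

```r
# Relationship between objfunc (MSE) and sigma (RMSE) under method = "EU";
# the MSE value is the objfunc reported in Example 1 below.
mse  <- 286.3378
rmse <- sqrt(mse)   # on the same scale as Y, as sigma is
round(rmse, 2)      # 16.92
```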
# Example 1 (with explicit covariates):
Y <- matrix(cars$dist, nrow = 1)
A <- rbind(1, cars$speed)
res <- nmfkc.cv(Y, A, Q = 1)
res$objfunc
#> [1] 286.3378
# Example 2 (kernel A and beta sweep):
Y <- matrix(cars$dist, nrow = 1)
U <- matrix(c(5, 10, 15, 20, 25), nrow = 1)
V <- matrix(cars$speed, nrow = 1)
betas <- 25:35/1000
obj <- numeric(length(betas))
for (i in seq_along(betas)) {
A <- nmfkc.kernel(U, V, beta = betas[i])
obj[i] <- nmfkc.cv(Y, A, Q = 1, div = 10)$objfunc
}
betas[which.min(obj)]
#> [1] 0.031
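For intuition about Example 2, a Gaussian kernel of the kind nmfkc.kernel(U, V, beta) presumably constructs can be sketched in base R. Here gauss_kernel is a hypothetical stand-in; the package's exact formula may differ:

```r
# Hypothetical Gaussian kernel sketch; gauss_kernel is NOT the package function.
gauss_kernel <- function(U, V, beta) {
  d2 <- outer(colSums(U^2), colSums(V^2), "+") - 2 * crossprod(U, V)
  exp(-beta * d2)  # entry (i, j) compares column i of U with column j of V
}
U <- matrix(c(5, 10, 15, 20, 25), nrow = 1)  # landmark points
V <- matrix(cars$speed, nrow = 1)            # sample covariates
A <- gauss_kernel(U, V, beta = 0.031)
dim(A)  # 5 x 50: one row per landmark, one column per sample
```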