nmfkc.cv performs k-fold cross-validation for the tri-factorization model
\(Y \approx X C A = X B\), where
\(Y(P,N)\) is the observation matrix,
\(A(R,N)\) is the covariate (or kernel) matrix,
\(X(P,Q)\) is the basis matrix (with \(Q \le \min(P,N)\)),
\(C(Q,R)\) is the parameter matrix, and
\(B(Q,N)\) is the coefficient matrix (\(B = C A\)).
Given \(Y\) (and optionally \(A\)), \(X\) and \(C\) are fitted on each training split and predictive performance is evaluated on the held-out split.
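As a quick orientation, the matrix dimensions can be sketched in base R. The sizes below are toy values chosen for illustration only (not package defaults):

```r
# Dimension sketch of Y ~ X C A = X B (toy sizes)
P <- 4; N <- 6; R <- 2; Q <- 2          # Q <= min(P, N)
Y <- matrix(runif(P * N), P, N)         # observations, P x N
A <- matrix(runif(R * N), R, N)         # covariates,   R x N
X <- matrix(runif(P * Q), P, Q)         # basis,        P x Q
C <- matrix(runif(Q * R), Q, R)         # parameters,   Q x R
B <- C %*% A                            # coefficients, Q x N
Yhat <- X %*% B                         # approximation of Y, P x N
dim(Yhat)                               # matches dim(Y)
```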
Arguments
- Y
Observation matrix.
- A
Covariate matrix. If NULL, the identity matrix is used.
- Q
Rank of the basis matrix \(X\); must satisfy \(Q \le \min(P,N)\).
- ...
Additional arguments controlling CV and the internal nmfkc call:
  - Y.weights
  Optional numeric matrix or vector; 0 indicates missing/ignored values.
  - div
  Number of folds (\(k\)); default: 5.
  - seed
  Integer seed for reproducible partitioning; default: 123.
  - shuffle
  Logical. If TRUE (default), samples are randomly shuffled before splitting (standard CV); if FALSE, the split is sequential (block CV; recommended for time series).
  - Arguments passed to nmfkc, e.g., gamma (B.L1), epsilon, maxit, method ("EU" or "KL"), X.restriction, X.init, etc.
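The shuffle switch only changes how the columns of \(Y\) are partitioned into folds. A minimal sketch of the two schemes in base R (assumed behavior for illustration; the package's internal partitioning may differ in detail):

```r
N <- 10; div <- 5
# shuffle = FALSE: contiguous blocks of columns (block CV)
block_folds <- rep(1:div, each = ceiling(N / div))[1:N]
# shuffle = TRUE: the same fold sizes, assigned in random order
set.seed(123)
shuffled_folds <- sample(block_folds)
block_folds  # 1 1 2 2 3 3 4 4 5 5
```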
Value
A list with components:
- objfunc
Mean loss per valid entry over all folds (MSE for method = "EU").
- sigma
Residual standard error (RMSE). Available only if method = "EU"; on the same scale as \(Y\).
- objfunc.block
Loss for each fold.
- block
Vector of fold indices (1, …, div) assigned to each column of \(Y\).
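Since objfunc is a mean squared error under method = "EU", sigma should correspond to its square root. This is a hedged reading of the documented MSE/RMSE relationship; the package may apply additional corrections:

```r
objfunc <- 286.3378     # example MSE from a cross-validation run
sigma <- sqrt(objfunc)  # RMSE, on the scale of Y
round(sigma, 2)         # 16.92
```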
Examples
# Example 1 (with explicit covariates):
Y <- matrix(cars$dist, nrow = 1)  # 1 x N observation matrix
A <- rbind(1, cars$speed)         # intercept row + covariate (2 x N)
res <- nmfkc.cv(Y, A, Q = 1)
res$objfunc
#> [1] 286.3378
# Example 2 (kernel A and beta sweep):
Y <- matrix(cars$dist, nrow = 1)
U <- matrix(c(5, 10, 15, 20, 25), nrow = 1)  # kernel anchor points
V <- matrix(cars$speed, nrow = 1)            # covariate values
betas <- 25:35 / 1000                        # candidate kernel parameters
obj <- numeric(length(betas))
for (i in seq_along(betas)) {
  A <- nmfkc.kernel(U, V, beta = betas[i])
  obj[i] <- nmfkc.cv(Y, A, Q = 1, div = 10)$objfunc  # 10-fold CV loss
}
betas[which.min(obj)]                        # beta with the smallest CV loss
betas[which.min(obj)]
#> [1] 0.031