nmfkc.cv performs k-fold cross-validation for the tri-factorization model \(Y \approx X C A = X B\), where

  • \(Y(P,N)\) is the observation matrix,

  • \(A(R,N)\) is the covariate (or kernel) matrix,

  • \(X(P,Q)\) is the basis matrix (with \(Q \le \min(P,N)\)),

  • \(C(Q,R)\) is the parameter matrix, and

  • \(B(Q,N)\) is the coefficient matrix (\(B = C A\)).

Given \(Y\) (and optionally \(A\)), \(X\) and \(C\) are fitted on each training split and predictive performance is evaluated on the held-out split.
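
The dimensions listed above can be checked with a small base-R sketch. The sizes and random values below are made up for illustration and are not produced by nmfkc; the sketch only confirms that \(X C A\) and \(X B\) give the same \(P \times N\) approximation.

```r
# Illustrative dimensions only; all values here are made up.
set.seed(1)
P <- 4; N <- 10; R <- 3; Q <- 2

X <- matrix(runif(P * Q), nrow = P, ncol = Q)  # basis matrix, P x Q
C <- matrix(runif(Q * R), nrow = Q, ncol = R)  # parameter matrix, Q x R
A <- matrix(runif(R * N), nrow = R, ncol = N)  # covariate matrix, R x N

B     <- C %*% A   # coefficient matrix, Q x N
Y_hat <- X %*% B   # approximation of Y, P x N

dim(B)      # 2 10
dim(Y_hat)  # 4 10
```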

Usage

nmfkc.cv(Y, A = NULL, Q = 2, ...)

Arguments

Y

Observation matrix.

A

Covariate matrix. If NULL, the \(N \times N\) identity matrix is used, so that \(B = C\).

Q

Rank of the basis matrix \(X\); must satisfy \(Q \le \min(P,N)\).

...

Additional arguments controlling CV and the internal nmfkc call:

Y.weights

Optional numeric matrix or vector of weights for Y; entries equal to 0 mark missing or ignored values.

div

Number of folds (\(k\)); default: 5.

seed

Integer seed for reproducible partitioning; default: 123.

shuffle

Logical. If TRUE (default), randomly shuffles samples (standard CV); if FALSE, splits sequentially (block CV; recommended for time series).

Arguments passed to nmfkc

For example: gamma (B.L1), epsilon, maxit, method ("EU" or "KL"), X.restriction, and X.init.
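
The effect of div, seed, and shuffle on the partition can be sketched in base R. The helper `make_folds` below is hypothetical and is not the package's partitioning code; it only assumes that columns are split into div near-equal folds, kept in order when shuffle = FALSE and randomly permuted otherwise.

```r
# Hypothetical helper, not part of nmfkc:
# assign each of N columns to one of `div` folds.
make_folds <- function(N, div = 5, seed = 123, shuffle = TRUE) {
  # Contiguous, near-equal blocks labelled 1, ..., div in column order.
  block <- sort(rep_len(seq_len(div), N))
  if (shuffle) {
    set.seed(seed)          # note: this resets the global RNG state
    block <- sample(block)  # random permutation of the fold labels
  }
  block
}

table(make_folds(50))                     # 10 columns in each of the 5 folds
make_folds(10, div = 5, shuffle = FALSE)  # 1 1 2 2 3 3 4 4 5 5
```

With shuffle = FALSE each held-out fold is a contiguous run of columns, which is why that setting suits time series: neighbouring time points stay together in the same fold.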

Value

A list with components:

objfunc

Mean loss per valid entry over all folds (MSE for method="EU").

sigma

Residual standard error (RMSE). Available only if method="EU"; on the same scale as Y.

objfunc.block

Loss for each fold.

block

Vector of fold indices (1, …, div) assigned to each column of \(Y\).
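
As a rough illustration of how these quantities relate under squared-error loss (method = "EU"), the sketch below pools made-up held-out residuals. It assumes that sigma is the square root of objfunc and that each fold's loss is a sum of squared errors over its valid entries; this page does not confirm either detail, so treat the code as a reading aid rather than the package's computation.

```r
# Hypothetical held-out residuals (Y minus prediction) for a 1 x 6 problem.
resid <- matrix(c(1, -2, 0, 3, -1, 2), nrow = 1)
block <- c(1, 1, 2, 2, 3, 3)                     # fold index of each column
W     <- matrix(c(1, 1, 1, 0, 1, 1), nrow = 1)   # 0 marks an ignored entry

sq <- W * resid^2   # squared errors, with ignored entries zeroed out

# Assumed per-fold loss: sum of squared errors over the fold's valid entries.
loss_by_fold <- sapply(1:3, function(k) sum(sq[, block == k]))

# Mean loss per valid entry over all folds, and its square root (RMSE).
objfunc <- sum(loss_by_fold) / sum(W)
sigma   <- sqrt(objfunc)

loss_by_fold  # 5 0 5
objfunc       # 2
```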

Examples

# Example 1 (with explicit covariates):
Y <- matrix(cars$dist, nrow = 1)
A <- rbind(1, cars$speed)
res <- nmfkc.cv(Y, A, Q = 1)
res$objfunc
#> [1] 286.3378

# Example 2 (kernel A and beta sweep):
Y <- matrix(cars$dist, nrow = 1)
U <- matrix(c(5, 10, 15, 20, 25), nrow = 1)
V <- matrix(cars$speed, nrow = 1)
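betas <- (25:35) / 1000  # candidate values: 0.025, 0.026, ..., 0.035
obj <- numeric(length(betas))
for (i in seq_along(betas)) {
  # Build the kernel matrix for this beta, then record its 10-fold CV loss.
  A <- nmfkc.kernel(U, V, beta = betas[i])
  obj[i] <- nmfkc.cv(Y, A, Q = 1, div = 10)$objfunc
}
# beta with the smallest cross-validated loss
betas[which.min(obj)]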
#> [1] 0.031