
nmfae.cv performs k-fold cross-validation by splitting the columns (samples) of \(Y_1\) and \(Y_2\) into nfolds folds. For each fold, the model \(Y_1 \approx X_1 \Theta X_2 Y_2\) is fitted on the training samples, and predictive performance is evaluated on the held-out samples.
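The column-splitting step can be sketched as follows. This is an illustrative reconstruction under the documented defaults (nfolds = 5, seed = 123, shuffle = TRUE), not the package internals:

```r
# Assign each of N columns to one of nfolds folds after shuffling (a sketch,
# assuming the defaults described in the '...' argument below).
N <- 30
nfolds <- 5
set.seed(123)
idx <- sample(N)                                    # shuffled column order
block <- integer(N)
block[idx] <- rep(seq_len(nfolds), length.out = N)  # fold label per column
table(block)                                        # 6 columns in each fold
```

Each fold then serves once as the held-out set while the remaining columns form the training set.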

When Y2 is a kernel matrix created by nmfkc.kernel (detected via its attributes), the symmetric kernel splitting convention is used: Y2[train, train] for fitting and Y2[train, test] for predicting the held-out samples.
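The kernel splitting convention can be illustrated on a toy symmetric matrix (a sketch with a made-up kernel, not the package internals):

```r
# Split an N x N symmetric kernel matrix K for one cross-validation fold
# (illustrative; K here is a toy Gram matrix, not one from nmfkc.kernel).
set.seed(123)
N <- 10
K <- tcrossprod(matrix(runif(3 * N), nrow = N))  # toy N x N symmetric kernel
test  <- 1:2
train <- setdiff(seq_len(N), test)
K.fit  <- K[train, train]  # used when fitting on the training samples
K.pred <- K[train, test]   # used to predict the held-out samples
dim(K.fit)   # 8 x 8
dim(K.pred)  # 8 x 2
```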

Usage

nmfae.cv(Y1, Y2 = Y1, rank = 2, rank.encoder = rank, ...)

Arguments

Y1

Output matrix \(Y_1\) (P1 x N). Non-negative.

Y2

Input matrix \(Y_2\) (P2 x N), or a kernel matrix (N x N). Default is Y1 (autoencoder).

rank

Integer. Rank of the decoder basis. Default is 2.

rank.encoder

Integer. Rank of the encoder basis. Default is rank.

...

Additional arguments passed to nmfae (e.g., epsilon, maxit, Y1.weights). Also accepts: nfolds (number of folds, default 5; div also accepted), seed (integer seed, default 123), shuffle (logical, default TRUE). For backward compatibility, Q, R are accepted as aliases for rank, rank.encoder.

Value

A list with components:

objfunc

Mean squared error per valid element over all folds.

sigma

Residual standard error (RMSE), same scale as \(Y_1\).

objfunc.block

Per-fold squared error totals.

block

Integer vector of fold assignments (1, ..., nfolds) for each column.
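From their definitions above, sigma is simply the square root of objfunc (an inference from the stated definitions, not a statement about the package internals):

```r
# Relation between the returned summaries, using hypothetical held-out values.
obs  <- matrix(c(1.1, 1.9, 3.2, 3.8), 2, 2)  # made-up held-out observations
pred <- matrix(c(1.0, 2.0, 3.0, 4.0), 2, 2)  # made-up predictions
objfunc <- mean((obs - pred)^2)  # mean squared error per element
sigma   <- sqrt(objfunc)         # RMSE, on the same scale as Y1
```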

Examples

Y <- t(iris[1:30, 1:4])
res <- nmfae.cv(Y, rank = 2, rank.encoder = 2, nfolds = 5, maxit = 500)
res$sigma
#> [1] 0.1429115