
nmfae.cv performs k-fold cross-validation by splitting the columns (samples) of \(Y_1\) and \(Y_2\) into nfolds folds. For each fold, the model \(Y_1 \approx X_1 \Theta X_2 Y_2\) is fitted on the training samples, and predictive performance is evaluated on the held-out samples.
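The column-splitting step can be sketched as follows. This is an illustrative reconstruction under the documented defaults (nfolds = 5, seed = 123, shuffle = TRUE), not the package internals:

```r
# Assign each of N columns to one of nfolds folds after shuffling (a sketch,
# assuming the defaults described in the '...' argument below).
N <- 30
nfolds <- 5
set.seed(123)
idx <- sample(N)                                    # shuffled column order
block <- integer(N)
block[idx] <- rep(seq_len(nfolds), length.out = N)  # fold label per column
table(block)                                        # 6 columns in each fold
```

Each fold then serves once as the held-out set while the remaining columns form the training set.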

When Y2 is a kernel matrix created by nmfkc.kernel (detected via its attributes), the symmetric kernel splitting convention is used: Y2[train, train] for fitting and Y2[train, test] for predicting the held-out samples.
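The kernel splitting convention can be illustrated on a toy symmetric matrix (a sketch with a made-up kernel, not the package internals):

```r
# Split an N x N symmetric kernel matrix K for one cross-validation fold
# (illustrative; K here is a toy Gram matrix, not one from nmfkc.kernel).
set.seed(123)
N <- 10
K <- tcrossprod(matrix(runif(3 * N), nrow = N))  # toy N x N symmetric kernel
test  <- 1:2
train <- setdiff(seq_len(N), test)
K.fit  <- K[train, train]  # used when fitting on the training samples
K.pred <- K[train, test]   # used to predict the held-out samples
dim(K.fit)   # 8 x 8
dim(K.pred)  # 8 x 2
```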

Usage

nmfae.cv(Y1, Y2 = Y1, rank = 2, rank.encoder = rank, ...)

Arguments

Y1

Output matrix \(Y_1\) (P1 x N). Non-negative.

Y2

Input matrix \(Y_2\) (P2 x N), or a kernel matrix (N x N). Default is Y1 (autoencoder).

rank

Integer. Rank of the decoder basis. Default is 2.

rank.encoder

Integer. Rank of the encoder basis. Default is rank.

...

Additional arguments passed to nmfae (e.g., epsilon, maxit, Y1.weights). Also accepts: nfolds (number of folds, default 5; div also accepted), seed (integer seed, default 123), shuffle (logical, default TRUE). For backward compatibility, Q, R are accepted as aliases for rank, rank.encoder.

Value

A list with components:

objfunc

Mean squared error per valid element over all folds.

sigma

Residual standard error (RMSE), same scale as \(Y_1\).

objfunc.block

Per-fold squared error totals.

block

Integer vector of fold assignments (1, ..., nfolds) for each column.
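From their definitions above, sigma is simply the square root of objfunc (an inference from the stated definitions, not a statement about the package internals):

```r
# Relation between the returned summaries, using hypothetical held-out values.
obs  <- matrix(c(1.1, 1.9, 3.2, 3.8), 2, 2)  # made-up held-out observations
pred <- matrix(c(1.0, 2.0, 3.0, 4.0), 2, 2)  # made-up predictions
objfunc <- mean((obs - pred)^2)  # mean squared error per element
sigma   <- sqrt(objfunc)         # RMSE, on the same scale as Y1
```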

Examples

Y <- t(iris[1:30, 1:4])
res <- nmfae.cv(Y, rank = 2, rank.encoder = 2, nfolds = 5, maxit = 500)
res$sigma
#> [1] 0.1429115