nmfkc.ecv performs k-fold cross-validation by randomly holding out individual elements of the data matrix (element-wise), assigning them a weight of 0 via Y.weights, and evaluating the reconstruction error on those held-out elements.
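The hold-out scheme described above can be sketched in base R. This is only an illustration, not the package's internal code: the helper make_element_folds is hypothetical, the fold assignment is an assumption, and weighted row means stand in for an actual NMF fit.

```r
# Hypothetical helper: split all matrix cells into nfolds random folds
make_element_folds <- function(Y, nfolds = 5, seed = 123) {
  set.seed(seed)
  idx <- sample(length(Y))                          # shuffled linear indices
  fold_id <- rep_len(seq_len(nfolds), length(idx))  # balanced fold labels
  unname(split(idx, fold_id))                       # list of index vectors
}

Y <- t(as.matrix(iris[1:30, 1:4]))                  # 4 x 30 matrix
folds <- make_element_folds(Y, nfolds = 3)

# Weight matrix for fold 1: 0 = held out, 1 = used for fitting
W <- matrix(1, nrow(Y), ncol(Y))
W[folds[[1]]] <- 0

# Placeholder "fit": row means over the retained cells only
# (a real run would fit the NMF model with these weights instead)
row_fit <- rowSums(Y * W) / rowSums(W)
Y_hat <- matrix(row_fit, nrow(Y), ncol(Y))

# Reconstruction error evaluated on the held-out cells only
mse_fold1 <- mean((Y[folds[[1]]] - Y_hat[folds[[1]]])^2)
```

Because the held-out cells carry weight 0, they cannot influence the fit, so the error computed on them is an out-of-sample estimate.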

This method, also known as Wold's CV, is a well-founded way to determine the optimal rank (Q) in NMF, because the error is measured on elements the fit never saw. The rank argument accepts a vector, so multiple ranks are evaluated simultaneously on the same folds.

For symmetric (network) data use nmfkc.net.ecv, which creates upper-triangle folds to prevent information leakage through the symmetric entries \(Y_{ij} = Y_{ji}\). Passing the old Y.symmetric argument here is no longer supported and stops with a redirect message.
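To see why upper-triangle folds avoid leakage, consider the following base R sketch. It is an assumption about the general idea, not the implementation of nmfkc.net.ecv; in particular, whether the diagonal is included in the folds is a guess made for illustration.

```r
set.seed(1)
n <- 6
A <- matrix(runif(n * n), n, n)
S <- (A + t(A)) / 2                          # symmetric toy matrix

# Draw folds only from cells with i <= j (diagonal included by assumption)
upper <- which(upper.tri(S, diag = TRUE))
fold_id <- sample(rep_len(1:3, length(upper)))
folds_upper <- unname(split(upper, fold_id))

# Mirror fold 1 so that (i, j) and (j, i) are held out together
rc <- arrayInd(folds_upper[[1]], dim(S))
W <- matrix(1, n, n)
W[rc] <- 0
W[rc[, 2:1, drop = FALSE]] <- 0
```

If folds were drawn from all cells independently, a held-out Y_ij could still be recovered from its retained twin Y_ji; mirroring the mask removes that shortcut.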

Usage

nmfkc.ecv(Y, A = NULL, rank = 1:3, data, ...)

Arguments

Y

Observation matrix, or a formula (see nmfkc for Formula Mode).

A

Covariate matrix. Ignored when Y is a formula.

rank

Vector of ranks to evaluate (e.g., 1:5). For backward compatibility, Q is accepted via ....

data

A data frame (required when Y is a formula with column names).

...

Additional arguments passed to nmfkc (e.g., method="EU"). Also accepts: nfolds (number of folds, default 5; div also accepted), seed (integer seed, default 123).
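A call that sets these options explicitly might look like the sketch below. It uses only the argument names documented on this page; the call is guarded so that the snippet does nothing when the nmfkc package is not installed.

```r
Y <- t(as.matrix(iris[1:30, 1:4]))

# Guarded so the snippet is harmless when nmfkc is not installed
if (requireNamespace("nmfkc", quietly = TRUE)) {
  res <- nmfkc::nmfkc.ecv(Y, rank = 1:3, nfolds = 5, seed = 123, method = "EU")
  print(res$objfunc)   # MSE per rank
  print(res$sigma)     # documented as available only with method = "EU"
}
```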

Value

A list with components:

objfunc

Numeric vector containing the Mean Squared Error (MSE) for each Q.

sigma

Numeric vector containing the Residual Standard Error (RMSE) for each Q. Only available if method="EU".

objfunc.fold

List with one element per rank in Q. Each element contains the MSE values for the k folds.

folds

A list of length nfolds (alias div), containing the linear indices of held-out elements for each fold (shared across all Q).
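Linear indices in R run down the columns of a matrix, so they can be mapped back to row and column positions with arrayInd. The index values below are made up for illustration and are not output from nmfkc.ecv.

```r
Y <- t(as.matrix(iris[1:30, 1:4]))   # 4 x 30 matrix
held_out <- c(1, 5, 120)             # hypothetical linear indices

# Convert column-major linear indices to (row, col) pairs
rc <- arrayInd(held_out, dim(Y))
colnames(rc) <- c("row", "col")
rc
```

With 4 rows, index 5 is the first cell of the second column and index 120 is the last cell of the matrix.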

References

Wold, S. (1978). Cross-validatory estimation of the number of components in factor and principal components models. Technometrics, 20(4), 397–405. doi:10.1080/00401706.1978.10489693

Owen, A. B., & Perry, P. O. (2009). Bi-cross-validation of the SVD and the nonnegative matrix factorization. The Annals of Applied Statistics, 3(2), 564–594. doi:10.1214/08-AOAS227 (cross-validation of the NMF rank; see also nmfkc.bicv).

See also

nmfkc, nmfkc.cv; other rank-selection criteria: nmfkc.rank, nmfkc.bicv, nmfkc.consensus, nmfkc.ard.

Examples

# Element-wise CV to select rank
Y <- t(iris[1:30, 1:4])
res <- nmfkc.ecv(Y, rank = 1:2, nfolds = 3)
#> Performing Element-wise CV for Q = 1,2 (3-fold)...
res$objfunc
#>       Q=1       Q=2 
#> 0.2597611 0.4376291
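
One common way to use this output, assumed here rather than prescribed by the package, is to pick the rank with the smallest cross-validated MSE. The vector below copies the values printed above so that the snippet runs without nmfkc.

```r
# Values copied from the printed res$objfunc above
objfunc <- c("Q=1" = 0.2597611, "Q=2" = 0.4376291)

# Rank with the lowest held-out MSE
best <- names(objfunc)[which.min(objfunc)]
best
#> [1] "Q=1"
```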