Element-wise Cross-Validation for nmfae (Wold's CV)

nmfae.ecv performs k-fold element-wise cross-validation by randomly holding out individual elements of \(Y_1\), assigning them a weight of 0 via Y1.weights, and evaluating the reconstruction error on those held-out elements.
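The hold-out step described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the helpers make_element_folds and fold_weights are hypothetical names, and the assumption that Y1.weights takes a 0/1 matrix with the same shape as \(Y_1\) is inferred from the description rather than confirmed.

```r
# Hypothetical helper (not part of the package): randomly partition the
# P * N cells of a matrix into nfolds groups of linear indices.
make_element_folds <- function(P, N, nfolds = 5, seed = 123) {
  set.seed(seed)
  idx <- sample(P * N)                          # random permutation of all cells
  split(idx, rep_len(seq_len(nfolds), P * N))   # list of nfolds index vectors
}

# Hypothetical helper: weight matrix with 0 at held-out cells, 1 elsewhere.
# A matrix of this kind is assumed to be what Y1.weights expects.
fold_weights <- function(P, N, held_out) {
  W <- matrix(1, nrow = P, ncol = N)
  W[held_out] <- 0
  W
}

Y <- t(as.matrix(iris[1:30, 1:4]))              # 4 x 30, as in the Examples
folds <- make_element_folds(nrow(Y), ncol(Y), nfolds = 3)
W1 <- fold_weights(nrow(Y), ncol(Y), folds[[1]])

# Stand-in for a fitted reconstruction: row means over the training cells
# only. A real run would fit the model with weights W1 instead.
row_mean_train <- rowSums(Y * W1) / rowSums(W1)
Yhat <- matrix(row_mean_train, nrow(Y), ncol(Y))

# The fold's error is the MSE over the held-out cells only.
mse_fold1 <- mean((Y[folds[[1]]] - Yhat[folds[[1]]])^2)
```

Because each cell belongs to exactly one fold, every element of \(Y_1\) is predicted once without having been used to fit the model that predicts it.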

This method (also known as Wold's CV) is suitable for determining the optimal rank pair \((Q, R)\) in three-layer NMF. Both rank and rank.encoder accept vector inputs. When rank.encoder = NULL (default), rank.encoder is set equal to rank and pairs are evaluated element-wise (i.e., \((Q_1, R_1), (Q_2, R_2), \dots\)). When rank.encoder is explicitly specified, all combinations of rank and rank.encoder are evaluated via expand.grid.
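The difference between the two modes can be illustrated with plain data frames. This is a sketch of the candidate sets only: the names paired and grid are illustrative, and how the function builds the paired case internally is an assumption.

```r
rank <- 1:3

# rank.encoder = NULL: element-wise pairs (Q_i, R_i) with R equal to Q
paired <- data.frame(Q = rank, R = rank)

# rank.encoder given explicitly: every combination via expand.grid
rank.encoder <- 1:3
grid <- expand.grid(Q = rank, R = rank.encoder)

nrow(paired)  # 3 candidate pairs
nrow(grid)    # 9 candidate pairs
```

In expand.grid the first argument varies fastest, so the grid runs through Q = 1, 2, 3 for R = 1 before moving on to R = 2. The number of model fits is the number of pairs times the number of folds, so the full grid grows quickly with longer rank vectors.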

Usage

nmfae.ecv(Y1, Y2 = Y1, rank = 1:2, rank.encoder = NULL, ...)

Arguments

Y1: Output matrix \(Y_1\) (\(P_1 \times N\)).
Y2: Input matrix \(Y_2\) (\(P_2 \times N\)). Default is Y1.
rank: Integer vector of decoder ranks to evaluate. Default is 1:2.
rank.encoder: Integer vector of encoder ranks to evaluate. Default is NULL, which sets rank.encoder = rank and evaluates element-wise pairs. When explicitly specified, all combinations with rank are evaluated.
...: Additional arguments passed to nmfae (e.g., epsilon, maxit). The following are also accepted: nfolds (number of folds, default 5; div is accepted as an alias) and seed (integer random seed, default 123). For backward compatibility, Q and R are accepted as aliases for rank and rank.encoder.

Value

A list with components:

objfunc: Named numeric vector of mean MSE for each (Q, R) pair.
sigma: Named numeric vector of RMSE (square root of MSE) for each pair.
objfunc.fold: Named list of per-fold MSE vectors for each pair.
folds: List of length nfolds containing the held-out element indices for each fold.
QR: Data frame with columns Q and R listing the evaluated pairs.

Lifecycle

This function is experimental. The interface may change in future versions; details will be described in a forthcoming paper.

Examples

Y <- t(iris[1:30, 1:4])
# Default: rank.encoder=NULL -> paired rank=rank.encoder
res <- nmfae.ecv(Y, rank = 1:3, nfolds = 3, maxit = 500)
#> Element-wise CV: 3 (Q,R) pairs, 3-fold, 9 tasks...
#>   Q=1, R=1: MSE=0.021319, sigma=0.1460
#>   Q=2, R=2: MSE=0.021287, sigma=0.1459
#>   Q=3, R=3: MSE=0.021299, sigma=0.1459
res$sigma
#>   Q=1,R=1   Q=2,R=2   Q=3,R=3 
#> 0.1460097 0.1459009 0.1459406 
# Explicit rank.encoder: full grid
res2 <- nmfae.ecv(Y, rank = 1:3, rank.encoder = 1:3, nfolds = 3, maxit = 500)
#> Element-wise CV: 9 (Q,R) pairs, 3-fold, 27 tasks...
#>   Q=1, R=1: MSE=0.021319, sigma=0.1460
#>   Q=2, R=1: MSE=0.021319, sigma=0.1460
#>   Q=3, R=1: MSE=0.021319, sigma=0.1460
#>   Q=1, R=2: MSE=0.021321, sigma=0.1460
#>   Q=2, R=2: MSE=0.021287, sigma=0.1459
#>   Q=3, R=2: MSE=0.021299, sigma=0.1459
#>   Q=1, R=3: MSE=0.021322, sigma=0.1460
#>   Q=2, R=3: MSE=0.021289, sigma=0.1459
#>   Q=3, R=3: MSE=0.021299, sigma=0.1459
res2$sigma
#>   Q=1,R=1   Q=2,R=1   Q=3,R=1   Q=1,R=2   Q=2,R=2   Q=3,R=2   Q=1,R=3   Q=2,R=3 
#> 0.1460097 0.1460097 0.1460097 0.1460179 0.1459009 0.1459408 0.1460194 0.1459088 
#>   Q=3,R=3 
#> 0.1459406
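One way to pick a rank pair from such a result is to take the minimum of sigma (or, equivalently, of objfunc). The sketch below copies the sigma values from the full-grid output above into a plain named vector so that it runs without the package. With a live result, res2$sigma would be used directly.

```r
# sigma values copied from the full-grid output above
sigma <- c(
  "Q=1,R=1" = 0.1460097, "Q=2,R=1" = 0.1460097, "Q=3,R=1" = 0.1460097,
  "Q=1,R=2" = 0.1460179, "Q=2,R=2" = 0.1459009, "Q=3,R=2" = 0.1459408,
  "Q=1,R=3" = 0.1460194, "Q=2,R=3" = 0.1459088, "Q=3,R=3" = 0.1459406
)

best <- names(which.min(sigma))   # name of the pair with the smallest RMSE
best
#> [1] "Q=2,R=2"

# Spread between the worst and best pair
diff(range(sigma))
```

On this small example the spread across all nine pairs is on the order of 1e-4, so the minimum at Q = 2, R = 2 is only a weak preference. Comparing the per-fold values in objfunc.fold is one way to judge whether such a difference is meaningful.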