nmfae.ecv performs k-fold element-wise cross-validation by randomly
holding out individual elements of \(Y_1\), assigning them a weight of 0
via Y1.weights, and evaluating the reconstruction error on those
held-out elements.
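The hold-out scheme described above can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: the variable names (fold_id, W, Yhat) are hypothetical, and a training-cell row mean stands in for the fitted reconstruction that nmfae would produce.

```r
# Hypothetical sketch of element-wise hold-out; not the package's own code.
set.seed(123)
Y1 <- t(as.matrix(iris[1:30, 1:4]))   # 4 x 30 matrix, 120 elements
nfolds <- 3

# Randomly assign every element (by linear index) to one of nfolds folds.
fold_id <- sample(rep_len(seq_len(nfolds), length(Y1)))
folds <- split(seq_along(Y1), fold_id)

# Weight matrix for fold 1: 0 = held-out element, 1 = training element.
W <- matrix(1, nrow(Y1), ncol(Y1))
W[folds[[1]]] <- 0

# Placeholder "model": row means computed from training elements only.
# In the real procedure this would be the reconstruction fitted with weights W.
Yhat <- matrix(rowSums(Y1 * W) / rowSums(W), nrow(Y1), ncol(Y1))

# Score the reconstruction on the held-out elements only.
mse_fold1 <- mean((Y1[folds[[1]]] - Yhat[folds[[1]]])^2)
```

Repeating the last three steps for each fold and averaging the per-fold errors gives one cross-validated score per rank pair.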
This method (also known as Wold's CV) is suitable for determining the optimal
rank pair \((Q, R)\) in three-layer NMF. Both rank and rank.encoder accept
vector inputs. When rank.encoder = NULL (default), rank.encoder is set equal to rank
and pairs are evaluated element-wise (i.e., \((Q_1, R_1), (Q_2, R_2), \dots\)).
When rank.encoder is explicitly specified, all combinations of rank and rank.encoder
are evaluated via expand.grid.
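The two evaluation modes can be illustrated with a small helper. The function make_pairs is hypothetical and only mimics the pairing rule described above; the column names Q and R follow the aliases documented for rank and rank.encoder.

```r
# Hypothetical helper illustrating how (Q, R) pairs are formed.
make_pairs <- function(rank, rank.encoder = NULL) {
  if (is.null(rank.encoder)) {
    # Default: encoder rank mirrors decoder rank, paired position by position.
    data.frame(Q = rank, R = rank)
  } else {
    # Explicit encoder ranks: every combination, with Q varying fastest.
    expand.grid(Q = rank, R = rank.encoder)
  }
}

paired <- make_pairs(1:3)        # 3 pairs: (1,1), (2,2), (3,3)
grid   <- make_pairs(1:3, 1:3)   # 9 pairs: (1,1), (2,1), (3,1), (1,2), ...
```

Because expand.grid varies its first factor fastest, the grid is ordered by R with Q cycling inside each R, which matches the order of the printed results in the examples.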
Arguments
- Y1
Output matrix \(Y_1\) (P1 x N).
- Y2
Input matrix \(Y_2\) (P2 x N). Default is Y1.
- rank
Integer vector of decoder ranks to evaluate. Default is 1:2.
- rank.encoder
Integer vector of encoder ranks to evaluate. Default is NULL, which sets rank.encoder = rank and evaluates element-wise pairs. When explicitly specified, all combinations with rank are evaluated.
- ...
Additional arguments passed to nmfae (e.g., epsilon, maxit). Also accepts: nfolds (number of folds, default 5; div also accepted), seed (integer seed, default 123). For backward compatibility, Q and R are accepted as aliases for rank and rank.encoder.
Value
A list with components:
- objfunc
Named numeric vector of mean MSE for each (Q, R) pair.
- sigma
Named numeric vector of RMSE (square root of MSE) for each pair.
- objfunc.fold
Named list of per-fold MSE vectors for each pair.
- folds
List of length div containing the held-out element indices for each fold.
- QR
Data frame with columns Q and R listing the evaluated pairs.
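The relationship between objfunc.fold, objfunc, and sigma can be sketched as below. The per-fold values are made up for illustration, and treating objfunc as the unweighted mean of the per-fold MSEs is an assumption; only sigma = sqrt(objfunc) is stated in the component descriptions.

```r
# Made-up per-fold MSE values for two (Q, R) pairs; illustration only.
objfunc.fold <- list(
  "Q=1,R=1" = c(0.020, 0.022, 0.021),
  "Q=2,R=2" = c(0.019, 0.023, 0.020)
)

# Assumption: the reported MSE is the plain average over folds.
objfunc <- sapply(objfunc.fold, mean)

# RMSE is the square root of the mean MSE.
sigma <- sqrt(objfunc)
```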
Examples
Y <- t(iris[1:30, 1:4])
# Default: rank.encoder=NULL -> paired rank=rank.encoder
res <- nmfae.ecv(Y, rank = 1:3, nfolds = 3, maxit = 500)
#> Element-wise CV: 3 (Q,R) pairs, 3-fold, 9 tasks...
#> Q=1, R=1: MSE=0.021319, sigma=0.1460
#> Q=2, R=2: MSE=0.021287, sigma=0.1459
#> Q=3, R=3: MSE=0.021299, sigma=0.1459
res$sigma
#> Q=1,R=1 Q=2,R=2 Q=3,R=3
#> 0.1460097 0.1459009 0.1459406
# Explicit rank.encoder: full grid
res2 <- nmfae.ecv(Y, rank = 1:3, rank.encoder = 1:3, nfolds = 3, maxit = 500)
#> Element-wise CV: 9 (Q,R) pairs, 3-fold, 27 tasks...
#> Q=1, R=1: MSE=0.021319, sigma=0.1460
#> Q=2, R=1: MSE=0.021319, sigma=0.1460
#> Q=3, R=1: MSE=0.021319, sigma=0.1460
#> Q=1, R=2: MSE=0.021321, sigma=0.1460
#> Q=2, R=2: MSE=0.021287, sigma=0.1459
#> Q=3, R=2: MSE=0.021299, sigma=0.1459
#> Q=1, R=3: MSE=0.021322, sigma=0.1460
#> Q=2, R=3: MSE=0.021289, sigma=0.1459
#> Q=3, R=3: MSE=0.021299, sigma=0.1459
res2$sigma
#> Q=1,R=1 Q=2,R=1 Q=3,R=1 Q=1,R=2 Q=2,R=2 Q=3,R=2 Q=1,R=3 Q=2,R=3
#> 0.1460097 0.1460097 0.1460097 0.1460179 0.1459009 0.1459408 0.1460194 0.1459088
#> Q=3,R=3
#> 0.1459406
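One way to pick a rank pair from such a result is to take the pair with the smallest sigma. The sketch below copies the sigma values printed above into a plain named vector so that it runs without the package; with a real result object the same calls would apply to res2$sigma.

```r
# sigma values copied from the printed output of res2$sigma above.
sigma <- c(
  "Q=1,R=1" = 0.1460097, "Q=2,R=1" = 0.1460097, "Q=3,R=1" = 0.1460097,
  "Q=1,R=2" = 0.1460179, "Q=2,R=2" = 0.1459009, "Q=3,R=2" = 0.1459408,
  "Q=1,R=3" = 0.1460194, "Q=2,R=3" = 0.1459088, "Q=3,R=3" = 0.1459406
)

# Name of the pair with the lowest cross-validated RMSE.
best <- names(which.min(sigma))
best
#> [1] "Q=2,R=2"

# Spread between the worst and best pair, to judge how decisive the minimum is.
spread <- diff(range(sigma))
```

In this example the spread across all nine pairs is very small relative to sigma itself, so the minimum should be read as a weak preference rather than a clear-cut choice.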