Owen & Perry's (2009) bi-cross-validation (BCV) for choosing the NMF
rank. A lightweight CV engine in the spirit of nmfkc.ecv:
it returns the held-out error per rank and nothing more (no plot, no
table) – pass the result to which.min(sigma), or build your own
diagnostics.
Unlike the element-wise CV of nmfkc.ecv (which holds out
scattered entries and refits with weights), BCV holds out a whole
row-block and column-block simultaneously: the model is fit only
on the retained block \(D\), and the held-out block \(A\) is predicted
by folding the held-out rows/columns onto the fixed \(D\)-factors via
non-negative regression (\(\hat A = L_I R_J\)). Because the held-out
rows and columns never enter the fit, there is no information leakage.
Covariates are ignored (plain NMF). The recommended setting is to leave
out roughly half the rows and half the columns (nfolds = 2).
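The hold-out and fold-in steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the nmfkc implementation: nmf_mu, nn_right, and bicv_block are hypothetical helper names, the factor model is a plain Lee-Seung multiplicative-update NMF standing in for nmfkc, and the folds are a simple balanced random split.

```r
# Plain multiplicative-update NMF, Y ~ W %*% H (illustrative stand-in for nmfkc)
nmf_mu <- function(Y, q, maxit = 500, eps = 1e-10) {
  W <- matrix(runif(nrow(Y) * q), nrow(Y), q)
  H <- matrix(runif(q * ncol(Y)), q, ncol(Y))
  for (it in seq_len(maxit)) {
    H <- H * (crossprod(W, Y) / (crossprod(W) %*% H + eps))
    W <- W * (tcrossprod(Y, H) / (W %*% tcrossprod(H) + eps))
  }
  list(W = W, H = H)
}

# Non-negative regression: minimise ||C - W R||_F over R >= 0 with W fixed
nn_right <- function(C, W, maxit = 100, eps = 1e-10) {
  R <- matrix(1, ncol(W), ncol(C))
  WtC <- crossprod(W, C)
  WtW <- crossprod(W)
  for (it in seq_len(maxit)) R <- R * (WtC / (WtW %*% R + eps))
  R
}

# One hold-out block: rows I and columns J never enter the fit on D
bicv_block <- function(Y, I, J, q) {
  fit <- nmf_mu(Y[-I, -J, drop = FALSE], q)                 # fit on D only
  L_I <- t(nn_right(t(Y[I, -J, drop = FALSE]), t(fit$H)))   # fold in held-out rows
  R_J <- nn_right(Y[-I, J, drop = FALSE], fit$W)            # fold in held-out columns
  A_hat <- L_I %*% R_J                                      # predicted block A
  mean((Y[I, J, drop = FALSE] - A_hat)^2)
}

set.seed(1)
Y <- matrix(abs(rnorm(30 * 3)), 30, 3) %*% matrix(abs(rnorm(3 * 40)), 3, 40)
nfolds <- 2
row_fold <- sample(rep_len(seq_len(nfolds), nrow(Y)))
col_fold <- sample(rep_len(seq_len(nfolds), ncol(Y)))

# Held-out RMSE per rank, averaged over the nfolds x nfolds grid of blocks
rmse <- sapply(1:4, function(q) {
  errs <- c()
  for (i in seq_len(nfolds)) for (j in seq_len(nfolds)) {
    errs <- c(errs, bicv_block(Y, which(row_fold == i), which(col_fold == j), q))
  }
  sqrt(mean(errs))
})
rmse
```

The held-out block Y[I, J] is touched only when the error is computed, which is the sense in which the scheme avoids leakage.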
Arguments
- Y
Observation matrix (\(P \times N\)), non-negative.
- rank
Integer vector of ranks to evaluate.
- ...
Advanced options, rarely needed (defaults in parentheses):
nfolds (2), the number of row and column folds (the grid is nfolds x nfolds; 2 leaves out half the rows / columns, Owen & Perry's recommendation); seed (123, fold-assignment seed); and nnls.maxit (100, multiplicative-update iterations for the fold-in non-negative regressions). Any other arguments are passed to nmfkc for the per-block fits (e.g. maxit).
Value
A list (cf. nmfkc.ecv) with:
- objfunc
Held-out mean squared error for each rank.
- sigma
Its square root (RMSE) for each rank.
- rank
The evaluated rank vector.
- nfolds
The number of folds used.
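Because only these four fields are returned, any diagnostics are left to the caller. The sketch below works on a mock list shaped like the return value (it does not call nmfkc.bicv); the sigma values are copied from the output in the Examples section, objfunc is reconstructed as sigma squared, and drop_pct is a hypothetical name for an ad hoc summary.

```r
# Mock of the returned list, filled with the sigma values printed in the Examples
sigma <- c(0.44454653, 0.31984467, 0.09025809,
           0.06145124, 0.05091540, 0.03910144)
names(sigma) <- paste0("rank=", 1:6)
bv <- list(objfunc = sigma^2, sigma = sigma, rank = 1:6, nfolds = 2)

# Rank with the smallest held-out RMSE; which.min() skips NA entries
best <- bv$rank[which.min(bv$sigma)]
best

# Relative improvement (in percent) gained by each additional rank
drop_pct <- 100 * -diff(bv$sigma) / head(bv$sigma, -1)
round(drop_pct, 1)
```

Looking at drop_pct alongside the minimum shows where the largest single improvement occurs, which may differ from the rank that which.min selects.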
Details
Each fold keeps about \((1 - 1/\text{nfolds})\) of the rows
and columns, and the retained block \(D\) must have more than
rank rows and columns to be fit. The largest testable rank is
therefore about \((1 - 1/\text{nfolds})\min(P, N) - 1\); with
nfolds = 2 this is roughly \(\min(P, N)/2 - 1\). Ranks above
this limit return NA and trigger a warning that states the limit
and suggests an nfolds value (or nmfkc.ecv) that would reach
the requested ranks. Raising nfolds lifts the
limit at the cost of a smaller hold-out and more compute
(\((\text{nfolds} - 1)^2\) full fits per rank).
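The rank limit can be checked before running a long sweep. The helper below is a hypothetical sketch, not part of nmfkc: it assumes balanced folds, so the smallest retained block is what remains after dropping the largest fold, and the exact threshold used by the package may differ slightly.

```r
# Hypothetical helper: largest rank for which the retained block still has
# more than `rank` rows and columns, assuming balanced folds
max_bicv_rank <- function(P, N, nfolds = 2) {
  kept_rows <- P - ceiling(P / nfolds)  # rows left after removing the largest row fold
  kept_cols <- N - ceiling(N / nfolds)  # columns left after removing the largest column fold
  min(kept_rows, kept_cols) - 1
}

lim2 <- max_bicv_rank(30, 40)              # 30 x 40 matrix, nfolds = 2: 14
lim5 <- max_bicv_rank(30, 40, nfolds = 5)  # nfolds = 5: 23

# Keep only the requested ranks that are expected to be testable
ranks <- 1:20
ok <- ranks[ranks <= lim2]
```

For the 30 x 40 matrix used in the Examples this agrees with the approximate formula above: 30/2 - 1 = 14 with nfolds = 2, and 0.8 * 30 - 1 = 23 with nfolds = 5.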
References
A. B. Owen and P. O. Perry (2009). Bi-cross-validation of the SVD and the nonnegative matrix factorization. Ann. Appl. Stat. 3(2):564–594. doi:10.1214/08-AOAS227 .
Examples
# \donttest{
## rank-3 non-negative data; bi-CV needs enough kept rows/cols per
## fold (> rank), so use a matrix with ample dimensions.
set.seed(1)
X <- matrix(abs(rnorm(30 * 3)), 30, 3)
B <- matrix(abs(rnorm(3 * 40)), 3, 40)
bv <- nmfkc.bicv(X %*% B, rank = 1:6) # nfolds = 2 (Owen & Perry) by default
#> bi-CV: ranks 1,2,3,4,5,6, 2x2 fold grid (Owen-Perry 2009)...
bv$sigma # held-out RMSE per rank
#> rank=1 rank=2 rank=3 rank=4 rank=5 rank=6
#> 0.44454653 0.31984467 0.09025809 0.06145124 0.05091540 0.03910144
bv$rank[which.min(bv$sigma)]
#> [1] 6
# }