Performs K-fold cross-validation to evaluate the equilibrium mapping of the NMF-FFB model.

For each fold, nmf.sem is fitted on the training samples, yielding an equilibrium mapping \(\hat Y_1 = M_{\mathrm{model}} Y_2\). The held-out endogenous variables \(Y_1\) are then predicted from \(Y_2\) using this mapping, and the mean absolute error (MAE) over all entries in the test block is computed. The returned value is the average MAE across folds.
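The fold loop described above can be sketched in a few lines of base R. This is only an illustration of the procedure, not the package's implementation: `fit_mapping` and `cv_mae_sketch` are hypothetical names, and the mapping is estimated by ordinary least squares as a stand-in for nmf.sem so that the sketch runs without the package.

```r
# Stand-in estimator (assumption): least-squares mapping with Y1 ~ M %*% Y2.
# The real function fits nmf.sem here instead.
fit_mapping <- function(Y1, Y2) {
  Y1 %*% t(Y2) %*% solve(Y2 %*% t(Y2))
}

# Hypothetical helper mirroring the described procedure:
# split samples (columns) into folds, fit on the training columns,
# predict the held-out Y1 block, and average the per-fold MAE.
cv_mae_sketch <- function(Y1, Y2, nfolds = 5, seed = NULL, shuffle = TRUE) {
  stopifnot(ncol(Y1) == ncol(Y2))
  N <- ncol(Y1)
  if (!is.null(seed)) set.seed(seed)
  idx <- if (shuffle) sample.int(N) else seq_len(N)
  fold <- integer(N)
  fold[idx] <- rep_len(seq_len(nfolds), N)   # fold label for each sample
  fold_mae <- numeric(nfolds)
  for (k in seq_len(nfolds)) {
    test <- fold == k
    M <- fit_mapping(Y1[, !test, drop = FALSE], Y2[, !test, drop = FALSE])
    pred <- M %*% Y2[, test, drop = FALSE]
    # MAE over all entries of the held-out block
    fold_mae[k] <- mean(abs(Y1[, test, drop = FALSE] - pred))
  }
  mean(fold_mae)                             # average MAE across folds
}

Y <- t(as.matrix(iris[, -5]))
mae_sketch <- cv_mae_sketch(Y[1:2, ], Y[3:4, ], nfolds = 3, seed = 1)
```

Because the estimator is a stand-in, the resulting value is not comparable with the MAE returned by nmf.sem.cv; only the splitting and scoring logic carries over.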

This implements the hyperparameter selection strategy described in the paper: hyperparameters are chosen by predictive cross-validation rather than direct inspection of the internal structural matrices.
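One way to apply this strategy is a small grid search that scores each candidate penalty pair by its cross-validated MAE. The sketch below is an assumption about how such a search might be organised; `select_penalties` is a hypothetical helper, not part of the package, and it takes the CV function as an argument so that it makes no claims about internals.

```r
# Hypothetical helper: score every (C1.L1, C2.L1) pair with a CV function
# and return the grid sorted by mean MAE (smallest first).
select_penalties <- function(cv_fun, Y1, Y2, C1.grid, C2.grid, ...) {
  grid <- expand.grid(C1.L1 = C1.grid, C2.L1 = C2.grid)
  grid$mae <- vapply(seq_len(nrow(grid)), function(i) {
    cv_fun(Y1, Y2, C1.L1 = grid$C1.L1[i], C2.L1 = grid$C2.L1[i], ...)
  }, numeric(1))
  grid[order(grid$mae), ]
}

# With the package that provides nmf.sem.cv attached, a call might look like:
# res <- select_penalties(nmf.sem.cv, Y1, Y2,
#                         C1.grid = c(0.1, 1, 10), C2.grid = c(0.01, 0.1, 1),
#                         rank = 2, nfolds = 5, seed = 1)
# res[1, ]  # candidate with the lowest mean MAE
```

Passing the same seed for every candidate is intended to keep the fold assignment identical across the grid, so that differences in MAE reflect the penalties rather than the split.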

Usage

nmf.ffb.cv(...)

nmf.sem.cv(
  Y1,
  Y2,
  rank = NULL,
  X.init = "nndsvd",
  X.L2.ortho = 100,
  C1.L1 = 1,
  C2.L1 = 0.1,
  epsilon = 1e-06,
  maxit = 5000,
  ...
)

Arguments

...

Additional arguments passed to nmf.sem, except rank, seed, div, and shuffle, which are handled here. Also accepts nfolds (number of folds, default 5; div is also accepted), seed (master random seed, default NULL), and shuffle (logical, default TRUE).

Y1

A non-negative numeric matrix of endogenous variables with rows = variables (P1), columns = samples (N).

Y2

A non-negative numeric matrix of exogenous variables with rows = variables (P2), columns = samples (N). Must satisfy ncol(Y1) == ncol(Y2).

rank

Integer; rank (number of latent factors) passed to nmf.sem. If NULL, nmf.sem decides the effective rank (via ... or nrow(Y2)).

X.init

Initialization strategy for X, forwarded to nmf.sem. One of "nndsvd" (default), "kmeans", "kmeansar", "runif", a numeric \(P_1 \times Q\) matrix, or NULL (alias for "nndsvd"). See nmf.sem for details.

X.L2.ortho

L2 orthogonality penalty for X.

C1.L1

L1 sparsity penalty for C1 (\(\Theta_1\)).

C2.L1

L1 sparsity penalty for C2 (\(\Theta_2\)).

epsilon

Convergence threshold for nmf.sem.

maxit

Maximum number of iterations for nmf.sem.
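The requirements stated for Y1 and Y2 (non-negative numeric matrices with samples in columns and matching column counts) can be checked before starting a potentially long cross-validation run. The helper below is a hypothetical pre-flight check written for illustration; it is not part of the package.

```r
# Hypothetical pre-flight check for the documented input requirements.
check_inputs <- function(Y1, Y2) {
  stopifnot(
    is.matrix(Y1), is.numeric(Y1), all(Y1 >= 0),   # endogenous block, P1 x N
    is.matrix(Y2), is.numeric(Y2), all(Y2 >= 0),   # exogenous block,  P2 x N
    ncol(Y1) == ncol(Y2)                           # same samples in both
  )
  invisible(TRUE)
}

Y <- t(as.matrix(iris[, -5]))   # variables in rows, samples in columns
ok <- check_inputs(Y[1:2, ], Y[3:4, ])
```

A common mistake is passing a samples-by-variables data frame directly; transposing and converting to a matrix first, as in the last two lines, satisfies the layout described above.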

Value

A numeric scalar: mean MAE across CV folds.

Examples

Y <- t(iris[, -5])
Y1 <- Y[1:2, ]
Y2 <- Y[3:4, ]
mae <- nmf.sem.cv(Y1, Y2, rank = 2, maxit = 500, nfolds = 3)
#> Warning: maximum iterations (500) reached...
#> Warning: maximum iterations (500) reached...
#> Warning: maximum iterations (500) reached...
mae
#> [1] 1.648133