Performs K-fold cross-validation to evaluate the equilibrium mapping of the NMF-SEM model.

For each fold, nmf.sem is fitted on the training samples, yielding an equilibrium mapping \(\hat Y_1 = M_{\mathrm{model}} Y_2\). The held-out endogenous variables \(Y_1\) are then predicted from \(Y_2\) using this mapping, and the mean absolute error (MAE) over all entries in the test block is computed. The returned value is the average MAE across folds.
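The fold-wise procedure above can be sketched in base R. This is only an illustration of the loop structure: `fit_mapping` is a hypothetical stand-in (least squares clipped at zero) and is not the estimator used by nmf.sem, `cv_mae` is a made-up helper name, and folds are assigned without shuffling for brevity.

```r
# Hypothetical stand-in for the mapping estimated by nmf.sem:
# least squares, clipped at zero. It only illustrates the CV loop.
fit_mapping <- function(Y1, Y2) {
  M <- Y1 %*% t(Y2) %*% solve(Y2 %*% t(Y2))
  pmax(M, 0)
}

cv_mae <- function(Y1, Y2, div = 5) {
  stopifnot(ncol(Y1) == ncol(Y2))
  n <- ncol(Y1)
  fold <- rep_len(seq_len(div), n)  # unshuffled fold labels, for brevity
  fold_mae <- vapply(seq_len(div), function(k) {
    test <- fold == k
    # Fit on the training columns only
    M <- fit_mapping(Y1[, !test, drop = FALSE], Y2[, !test, drop = FALSE])
    # Predict the held-out Y1 block from the held-out Y2 block
    Y1_hat <- M %*% Y2[, test, drop = FALSE]
    # MAE over all entries of the test block
    mean(abs(Y1[, test, drop = FALSE] - Y1_hat))
  }, numeric(1))
  mean(fold_mae)  # average MAE across folds
}

set.seed(1)
Y2 <- matrix(runif(3 * 40), nrow = 3)     # P2 = 3 variables, N = 40 samples
M_true <- matrix(runif(4 * 3), nrow = 4)  # P1 = 4 variables
Y1 <- M_true %*% Y2                       # noiseless toy data
cv_result <- cv_mae(Y1, Y2, div = 5)
```

Because the toy data are noiseless and the true mapping is non-negative, the stand-in recovers it almost exactly and the score is close to zero; with real data the score reflects how well the fitted model predicts unseen samples.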

This implements the hyperparameter selection strategy described in the paper: hyperparameters are chosen by predictive cross-validation rather than direct inspection of the internal structural matrices.
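A minimal sketch of such a selection loop is given below, assuming a plain grid search in which the candidate with the lowest cross-validated MAE is preferred. `select_by_cv` and `toy_score` are hypothetical names introduced for illustration; the toy scorer replaces the real call so that the block runs without the package.

```r
# Evaluate a scoring function on every row of a hyperparameter grid and
# return the grid sorted by score (lower MAE is better).
select_by_cv <- function(grid, score_fun) {
  grid$mae <- vapply(
    seq_len(nrow(grid)),
    function(i) do.call(score_fun, as.list(grid[i, , drop = FALSE])),
    numeric(1)
  )
  grid[order(grid$mae), , drop = FALSE]
}

grid <- expand.grid(rank = 1:3, C1.L1 = c(0, 0.5, 1))

# With the package loaded, the scorer would wrap the documented call, e.g.
#   score_fun <- function(rank, C1.L1)
#     nmf.sem.cv(Y1, Y2, rank = rank, C1.L1 = C1.L1, seed = 1)
# A toy scorer (hypothetical, minimised at rank = 2, C1.L1 = 0.5)
# keeps this sketch self-contained:
toy_score <- function(rank, C1.L1) abs(rank - 2) + abs(C1.L1 - 0.5)

ranked <- select_by_cv(grid, toy_score)
best <- ranked[1, ]
```

Fixing `seed` across candidates, as in the commented call, keeps the fold split identical so that differences in MAE are attributable to the hyperparameters rather than to the partition.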

Usage

nmf.sem.cv(
  Y1,
  Y2,
  rank = NULL,
  X.init = NULL,
  X.L2.ortho = 100,
  C1.L1 = 0.5,
  C2.L1 = 0,
  epsilon = 1e-04,
  maxit = 50000,
  seed = NULL,
  div = 5,
  shuffle = TRUE,
  ...
)

Arguments

Y1

A non-negative numeric matrix of endogenous variables with rows = variables (P1), columns = samples (N).

Y2

A non-negative numeric matrix of exogenous variables with rows = variables (P2), columns = samples (N). Must satisfy ncol(Y1) == ncol(Y2).

rank

Integer; rank (number of latent factors) passed to nmf.sem. If NULL, nmf.sem determines the effective rank itself (from arguments passed via ..., or from nrow(Y2)).

X.init

Optional initialization for X (as in nmf.sem).

X.L2.ortho

L2 orthogonality penalty for X.

C1.L1

L1 sparsity penalty for C1 (\(\Theta_1\)).

C2.L1

L1 sparsity penalty for C2 (\(\Theta_2\)).

epsilon

Convergence threshold for nmf.sem.

maxit

Maximum number of iterations for nmf.sem.

seed

Master random seed used for the CV split and for the fold-specific calls to nmf.sem. If NULL, the RNG is not controlled within folds, so results may differ between runs.

div

Number of CV folds (the K in K-fold). (Default: 5)

shuffle

Logical; if TRUE, samples are randomly permuted before assigning to folds. (Default: TRUE)

...

Additional arguments passed to nmf.sem, except rank, seed, div, and shuffle, which are handled by nmf.sem.cv itself.
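The input requirements and the roles of div, shuffle, and seed can be illustrated with a short base R sketch. Both helpers (`check_inputs`, `assign_folds`) are hypothetical and show one plausible scheme; the exact way nmf.sem.cv forms its folds is not documented here and may differ.

```r
# Check the documented layout: non-negative numeric matrices with
# variables in rows and samples in columns, sharing the same N.
check_inputs <- function(Y1, Y2) {
  stopifnot(
    is.matrix(Y1), is.matrix(Y2),
    is.numeric(Y1), is.numeric(Y2),
    all(Y1 >= 0), all(Y2 >= 0),
    ncol(Y1) == ncol(Y2)
  )
  invisible(TRUE)
}

# Assign each of n samples to one of div folds (assumed scheme).
# With shuffle = TRUE the samples are permuted first; a non-NULL seed
# makes that permutation reproducible.
assign_folds <- function(n, div = 5, shuffle = TRUE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  ord <- if (shuffle) sample.int(n) else seq_len(n)
  fold <- integer(n)
  fold[ord] <- rep_len(seq_len(div), n)
  fold
}

Y1 <- matrix(runif(4 * 12), nrow = 4)  # P1 = 4 variables, N = 12 samples
Y2 <- matrix(runif(3 * 12), nrow = 3)  # P2 = 3 variables, N = 12 samples
check_inputs(Y1, Y2)

folds_fixed  <- assign_folds(ncol(Y1), div = 5, shuffle = FALSE)
folds_random <- assign_folds(ncol(Y1), div = 5, shuffle = TRUE, seed = 42)
```

When N is not a multiple of div, fold sizes differ by at most one under this scheme. Shuffling matters mainly when the columns are ordered, for example by time or by group, since unshuffled folds would then hold out systematically different samples.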

Value

A numeric scalar: mean MAE across CV folds.
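The small arithmetic example below assumes that "mean MAE across CV folds" is an unweighted mean of the per-fold MAEs; whether folds are weighted by size is not stated here. Under that assumption the value can differ from the MAE pooled over all held-out entries when folds have unequal sizes. The error values are made up for illustration.

```r
# Hypothetical absolute errors for two folds of unequal size
abs_err <- list(fold1 = c(1, 1, 1, 1), fold2 = c(3, 3))

fold_mae <- vapply(abs_err, mean, numeric(1))  # per-fold MAE: 1 and 3
mean_of_folds <- mean(fold_mae)                # unweighted mean: (1 + 3) / 2
pooled_mae <- mean(unlist(abs_err))            # pooled over entries: 10 / 6
```

With equal fold sizes the two quantities coincide, so the distinction only matters when N is not a multiple of div.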