
Solves $$Y \approx X\,\Theta\,A,\qquad X \ge 0,\;\Theta\in\mathbb{R}^{Q\times D}, \;A\in\mathbb{R}^{D\times N},$$ where the covariate matrix \(A\) and the coefficient matrix \(\Theta\) may be signed. Internally \(A = A_{+} - A_{-}\) and \(\Theta = C_{+} - C_{-}\) with \(A_{\pm}, C_{\pm} \ge 0\) (the sign-splitting trick of Ding et al. 2010), and the problem is solved by a Direct Multiplicative Update algorithm whose iteration cost is \(O(Q D^2)\), independent of \(N\).

Only \(X\) is structurally constrained to be non-negative (in the Semi-NMF sense of Ding, Li, & Jordan 2010). In particular, \(Y\) may contain negative entries, in which case the response is fit in the least-squares sense without any non-negativity requirement on \(Y\).

When \(A \ge 0\) (so \(A_{-} = 0\)), the result reduces to nmfkc(Y, A, rank) with Euclidean loss, up to reordering.
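The sign-splitting step described above can be sketched in base R. This is only an illustration of the identity \(M = M_{+} - M_{-}\) and of how \(\Theta A\) expands into products of non-negative parts; `split_signed` is a hypothetical helper, not a function of the package, and the sketch does not reproduce the package's update rules.

```r
set.seed(1)
A     <- matrix(rnorm(4 * 6), 4, 6)   # signed covariates, D x N
Theta <- matrix(rnorm(3 * 4), 3, 4)   # signed coefficients, Q x D

# Hypothetical helper: positive and negative parts, both non-negative.
split_signed <- function(M) list(pos = pmax(M, 0), neg = pmax(-M, 0))

A_parts <- split_signed(A)       # A_+ and A_-
C_parts <- split_signed(Theta)   # C_+ and C_-

# The split is exact: M = M_+ - M_-.
stopifnot(isTRUE(all.equal(A, A_parts$pos - A_parts$neg)))
stopifnot(isTRUE(all.equal(Theta, C_parts$pos - C_parts$neg)))

# Theta %*% A written only with non-negative factors:
# (C_+ A_+ + C_- A_-) - (C_+ A_- + C_- A_+)
TA_expanded <-
  (C_parts$pos %*% A_parts$pos + C_parts$neg %*% A_parts$neg) -
  (C_parts$pos %*% A_parts$neg + C_parts$neg %*% A_parts$pos)
stopifnot(isTRUE(all.equal(Theta %*% A, TA_expanded)))
```

When \(A \ge 0\), `A_parts$neg` is all zeros and only the \(A_{+}\) terms remain.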

Usage

nmfkc.signed(
  Y,
  A,
  rank = NULL,
  epsilon = 1e-04,
  maxit = 5000,
  verbose = TRUE,
  ...
)

Arguments

Y

Real-valued \(Q_{\mathrm{obs}} \times N\) response matrix. Unlike nmfkc, negative entries are allowed.

A

Real-valued \(D \times N\) covariate matrix (signed). A single matrix is passed; its positive and negative parts \(A_{+} = \max(A, 0)\) and \(A_{-} = \max(-A, 0)\) are computed internally. When using Random Fourier Features (Rahimi & Recht 2007) as \(A\), supply the RFF parameters via the hidden pars argument so that predict() can regenerate features for new data (see pars entry in ... below).

rank

Integer. Number of latent components \(Q\) in \(X\).

epsilon

Relative convergence tolerance on the objective (default 1e-4).

maxit

Maximum number of iterations (default 5000).

verbose

Logical. Print dimensions at start (default TRUE).

...

Additional arguments:

  • Q: alias for rank.

  • X.restriction: constraint applied to columns of \(X\) after every update, with the scale absorbed into \(C_{+}, C_{-}\). One of "colSums" (default, \(\mathrm{colSums}(X) = 1\)), "colSqSums", "totalSum", "none", "fixed".

  • X.init: initialization strategy for the basis matrix \(X\) (\(Q_{\mathrm{obs}} \times Q\)). Accepts the same menu as nmfkc: "kmeans" (default), "kmeansar", "nndsvd", "runif", or a user-supplied \(Q_{\mathrm{obs}} \times Q\) non-negative numeric matrix. String methods delegate to the shared internal helper .init_X_method() (see nmfkc for the definitions of each method). For signed \(Y\), "kmeans" cluster centers may contain negative entries; they are clipped to zero to satisfy \(X \ge 0\), and any column that collapses to all-zeros is re-filled with small \(\mathrm{Uniform}(0, 0.1)\) noise.

  • C.init: explicit initial \(Q \times D\) coefficient matrix \(\Theta\) (signed). Split internally.

  • warm.start: logical (default TRUE). If TRUE and \(Y \ge 0\), runs nmfkc(Y, A = rbind(A_+, A_-), rank = Q) internally to seed \(X, C_{+}, C_{-}\). The user's X.init, seed, nstart, and X.restriction are forwarded to the internal nmfkc call so that initialization choices propagate consistently between the warm-start and the signed MU loop. Ignored when \(Y\) has negative entries (warm-start is disabled; X.init is used directly by the signed branch instead).

  • seed: RNG seed for random initialization (default 123).

  • prefix: name prefix for rows of \(C\) and columns of \(X\) (default "Basis").

  • pars: optional list of the form list(omega, b, D, beta) holding the Random Fourier Feature parameters (Rahimi & Recht 2007; omega: frequency matrix, b: phase offset, D: feature dimension, beta: bandwidth). When supplied, it is stored in the returned object so that summary() can report \(\beta\) and downstream predict() calls can regenerate RFF features for new data. If A is not RFF features, leave this NULL.

  • Y.weights: Optional non-negative weight matrix (\(Q_{\mathrm{obs}} \times N\)) or vector (length \(N\)), analogous to the weights argument of lm. Loss becomes \(\sum W_{ij} \, (Y_{ij} - (XCA)_{ij})^2\) (lm()-style, linear in \(W\)). Logical matrices (TRUE / FALSE) are also accepted. Typical usage by nmfkc.signed.cv / nmfkc.signed.ecv passes a binary mask \(W \in \{0,1\}\) to hold out test elements; real-valued weights for observation-level importance weighting are also supported. Default NULL: if Y has NA, a binary mask is auto-constructed (0 for NA, 1 elsewhere); otherwise no weighting.

  • nstart: number of random restarts. Signed models have more local minima than non-negative ones because \(\Theta = C_{+} - C_{-}\) can take both positive and negative values. nmfkc.signed() itself does not loop over restarts; the caller controls the outer loop, for example by running the function several times with different seed values and keeping the fit with the smallest $objfunc. A restart budget of 10 to 50 is recommended for publication-grade runs on signed data.
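The weighted loss described under Y.weights can be illustrated in base R. This is a minimal sketch, assuming the loss is exactly \(\sum W_{ij} (Y_{ij} - \widehat Y_{ij})^2\) as stated above; `weighted_sse` is a hypothetical helper, `Yhat` is a random stand-in for \(X C A\), and the assumption that a length-\(N\) weight vector applies one weight per column of \(Y\) is mine, not confirmed by the package.

```r
set.seed(1)
Y    <- matrix(rnorm(3 * 5), 3, 5)
Y[2, 4] <- NA                        # one missing entry
Yhat <- matrix(rnorm(3 * 5), 3, 5)   # stand-in for X %*% C %*% A

# Binary mask as described for the NA case: 0 where Y is NA, 1 elsewhere.
W <- (!is.na(Y)) * 1

# Hypothetical helper: weighted sum of squared residuals, linear in W.
weighted_sse <- function(Y, Yhat, W) {
  R <- Y - Yhat
  R[W == 0] <- 0   # zero-weight entries contribute nothing (also clears NA)
  sum(W * R^2)
}
loss <- weighted_sse(Y, Yhat, W)

# Assumed convention for a length-N weight vector: one weight per column.
w     <- c(1, 1, 2, 0, 1)
W_vec <- matrix(w, nrow(Y), ncol(Y), byrow = TRUE)
loss_vec <- weighted_sse(Y, Yhat, W_vec)
```

With the binary mask, `loss` equals the ordinary sum of squared residuals over the observed entries only.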

Value

An object of class c("nmfkc.signed", "nmfkc") with

  • X: \(Q_{\mathrm{obs}} \times Q\) basis matrix (non-negative, column-normalized according to X.restriction).

  • Cp, Cn: \(Q \times D\) non-negative parts of \(\Theta\), so that \(\Theta = C_{+} - C_{-}\).

  • C: \(C_{+} - C_{-}\) (= \(\Theta\)), signed.

  • B: \(C \, A\), \(Q \times N\) (signed).

  • objfunc.iter: objective values per iteration.

  • objfunc: final objective.

  • r.squared: \(\mathrm{cor}(Y, \widehat Y)^2\).

  • mae: mean absolute error.

  • iter: number of iterations performed.

  • runtime: elapsed seconds.

  • Y.signed: logical; whether \(Y\) contained negative entries during fitting.

  • pars: RFF generating parameters, if supplied.

  • call: the matched call.
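The relationships among the returned components can be checked by hand in base R. This is a sketch on random stand-in matrices, not on an actual fit. It assumes that the "colSums" restriction rescales each column of \(X\) to sum to one and moves the scale onto the matching rows of \(C_{+}, C_{-}\), and that r.squared is the squared correlation taken over all entries of \(Y\) and \(\widehat Y\).

```r
set.seed(1)
Q_obs <- 6; Q <- 2; D <- 3; N <- 10
X  <- matrix(runif(Q_obs * Q), Q_obs, Q)   # non-negative basis
Cp <- matrix(runif(Q * D), Q, D)           # C_+
Cn <- matrix(runif(Q * D), Q, D)           # C_-
A  <- matrix(rnorm(D * N), D, N)           # signed covariates
Y  <- matrix(rnorm(Q_obs * N), Q_obs, N)   # signed response

# Column normalization with the scale absorbed into C_+ and C_-.
s    <- colSums(X)
Xn   <- sweep(X, 2, s, "/")   # each column of Xn sums to 1
Cp_n <- Cp * s                # row q of Cp is multiplied by s[q]
Cn_n <- Cn * s

C    <- Cp_n - Cn_n           # signed Theta, Q x D
B    <- C %*% A               # Q x N, signed
Yhat <- Xn %*% B              # fitted values

# Rescaling does not change the fitted values.
stopifnot(isTRUE(all.equal(Yhat, X %*% (Cp - Cn) %*% A)))

r.squared <- cor(as.vector(Y), as.vector(Yhat))^2
mae       <- mean(abs(Y - Yhat))
```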

Lifecycle

This function is experimental. The interface may change in future versions.

References

Ding, C. H. Q., Li, T., & Jordan, M. I. (2010). Convex and semi-nonnegative matrix factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1), 45–55.

Rahimi, A., & Recht, B. (2007). Random features for large-scale kernel machines. Advances in Neural Information Processing Systems, 20.

Examples

# \donttest{
set.seed(1)
## Example 1: signed A (e.g., hand-built RFF features), non-negative Y
## Build simple signed features Z = sqrt(2/D) * cos(omega^T U + b):
U     <- matrix(stats::rnorm(5 * 40), 5, 40)           # raw input
D     <- 20                                            # feature dim
omega <- matrix(stats::rnorm(5 * D), 5, D)             # random freqs
b     <- stats::runif(D, 0, 2 * pi)                    # phase
Z     <- sqrt(2 / D) *
           cos(t(omega) %*% U + matrix(b, D, 40))      # D x 40, signed
Y     <- matrix(abs(stats::rnorm(8 * 40)), 8, 40)
res1  <- nmfkc.signed(Y, A = Z, rank = 3, maxit = 200)
#> Y(8,40) ~ X(8,3) %*% C(3,20) %*% A(20,40)  [signed covariate]

## Example 2: signed Y (regression)
Y2    <- matrix(stats::rnorm(8 * 40), 8, 40)           # signed response
res2  <- nmfkc.signed(Y2, A = Z, rank = 3, maxit = 200,
                       warm.start = FALSE)
#> Y(8,40) ~ X(8,3) %*% C(3,20) %*% A(20,40)  [signed covariate], Y signed
# }
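The restart strategy described for nstart (refit with several seeds, keep the fit with the smallest $objfunc) might be wrapped as follows. `best_of_seeds` and `toy_fit` are hypothetical names introduced for this sketch; the toy fit only mimics the $objfunc field so the code runs without the package, and the commented call shows how the helper could be applied to Example 2.

```r
# Hypothetical helper: fit once per seed, keep the smallest final objective.
best_of_seeds <- function(fit_fun, seeds) {
  fits <- lapply(seeds, fit_fun)
  objs <- vapply(fits, function(f) as.numeric(f$objfunc), numeric(1))
  fits[[which.min(objs)]]
}

# Toy stand-in for a fit: a list with an objfunc element.
toy_fit <- function(s) list(seed = s, objfunc = (s - 3)^2)
best <- best_of_seeds(toy_fit, seeds = 1:5)
best$seed
#> [1] 3

# With the package loaded and Y2, Z defined as in Example 2:
# best <- best_of_seeds(
#   function(s) nmfkc.signed(Y2, A = Z, rank = 3, maxit = 200,
#                            warm.start = FALSE, seed = s, verbose = FALSE),
#   seeds = 1:10
# )
```

Passing the fitting call as a function keeps the helper independent of the model, so the same loop works for any fit that returns an objfunc value.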