Solves $$Y \approx X\,\Theta\,A,\qquad X \ge 0,\;\Theta\in\mathbb{R}^{Q\times D}, \;A\in\mathbb{R}^{D\times N},$$ where the covariate matrix \(A\) and the coefficient matrix \(\Theta\) may be signed. Internally \(A = A_{+} - A_{-}\) and \(\Theta = C_{+} - C_{-}\) with \(A_{\pm}, C_{\pm} \ge 0\) (sign-splitting trick, Ding et al. 2010), and the problem is solved by a Direct Multiplicative Update algorithm whose iteration cost is \(O(Q D^2)\), independent of \(N\).
Only \(X\) is structurally constrained to be non-negative (Semi-NMF sense of Ding, Li, & Jordan 2010). In particular, \(Y\) may contain negative entries, in which case the response is fit in the least-squares sense without any non-negativity requirement on \(Y\).
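The sign-splitting trick above is easy to reproduce in base R: `pmax()` gives the positive and negative parts of a signed matrix, and their difference recovers the original exactly. A minimal illustration (not package code):

```r
# Sign-splitting: any signed matrix A decomposes as A = A_plus - A_minus
# with both parts non-negative (Ding et al. 2010).
set.seed(1)
A <- matrix(stats::rnorm(6), 2, 3)  # signed covariate matrix
A_plus  <- pmax(A, 0)               # positive part, max(A, 0)
A_minus <- pmax(-A, 0)              # negative part, max(-A, 0)
stopifnot(all(A_plus >= 0), all(A_minus >= 0))
all.equal(A, A_plus - A_minus)      # the split is exact
```

This is why only non-negative update rules are needed internally: the signed problem in \(A\) and \(\Theta\) becomes a non-negative problem in \(A_{\pm}\) and \(C_{\pm}\).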
When \(A \ge 0\) (so \(A_{-} = 0\)), the result reduces to `nmfkc(Y, A, rank)` with Euclidean loss, up to reordering of components.
Arguments
- Y
Real-valued \(Q_{\mathrm{obs}} \times N\) response matrix. Unlike `nmfkc`, negative entries are allowed.
- A
Real-valued \(D \times N\) covariate matrix (signed). A single matrix is passed; its positive and negative parts \(A_{+} = \max(A, 0)\) and \(A_{-} = \max(-A, 0)\) are computed internally. When using Random Fourier Features (Rahimi & Recht 2007) as \(A\), supply the RFF parameters via the hidden `pars` argument so that `predict()` can regenerate features for new data (see the `pars` entry in `...` below).
- rank
Integer. Number of latent components \(Q\) in \(X\).
- epsilon
Relative convergence tolerance on the objective (default `1e-4`).
- maxit
Maximum number of iterations (default `5000`).
- verbose
Logical. Print dimensions at start (default `TRUE`).
- ...
Additional arguments:
  - `Q`: alias for `rank`.
  - `X.restriction`: constraint applied to columns of \(X\) after every update, with the scale absorbed into \(C_{+}, C_{-}\). One of `"colSums"` (default, \(\mathrm{colSums}(X) = 1\)), `"colSqSums"`, `"totalSum"`, `"none"`, `"fixed"`.
  - `X.init`: initialization strategy for the basis matrix \(X\) (\(Q_{\mathrm{obs}} \times Q\)). Accepts the same menu as `nmfkc`: `"kmeans"` (default), `"kmeansar"`, `"nndsvd"`, `"runif"`, or a user-supplied \(Q_{\mathrm{obs}} \times Q\) non-negative numeric matrix. String methods delegate to the shared internal helper `init_X_method()` (see `nmfkc` for the definition of each method). For signed \(Y\), `"kmeans"` cluster centers may contain negative entries; they are clipped to zero to satisfy \(X \ge 0\), and any column that collapses to all zeros is re-filled with small \(\mathrm{Uniform}(0, 0.1)\) noise.
  - `C.init`: explicit initial \(Q \times D\) coefficient matrix \(\Theta\) (signed). Split internally.
  - `warm.start`: logical (default `TRUE`). If `TRUE` and \(Y \ge 0\), runs `nmfkc(Y, A = rbind(A_+, A_-), rank = Q)` internally to seed \(X, C_{+}, C_{-}\). The user's `X.init`, `seed`, `nstart`, and `X.restriction` are forwarded to the internal `nmfkc` call so that initialization choices propagate consistently between the warm start and the signed MU loop. Ignored when \(Y\) has negative entries (warm start is disabled; `X.init` is used directly by the signed branch instead).
  - `seed`: RNG seed for random initialization (default `123`).
  - `prefix`: name prefix for rows of \(C\) and columns of \(X\) (default `"Basis"`).
  - `pars`: optional list `list(omega, b, D, beta)` of Random Fourier Feature parameters (Rahimi & Recht 2007; `omega`: frequency matrix, `b`: phase offset, `D`: feature dimension, `beta`: bandwidth). When supplied, it is stored in the returned object so that `summary()` can report \(\beta\) and downstream `predict()` calls can regenerate RFF features for new data. If `A` is not RFF features, leave this `NULL`.
  - `Y.weights`: optional non-negative weight matrix (\(Q_{\mathrm{obs}} \times N\)) or vector (length \(N\)), analogous to the `weights` argument of `lm`. The loss becomes \(\sum_{ij} W_{ij} \, (Y_{ij} - (XCA)_{ij})^2\) (`lm()`-style, linear in \(W\)). Logical (`TRUE`/`FALSE`) matrices are also accepted. Typical usage by `nmfkc.signed.cv`/`nmfkc.signed.ecv` passes a binary mask \(W \in \{0,1\}\) to hold out test elements; real-valued weights for observation-level importance weighting are also supported. Default `NULL`: if `Y` has `NA`, a binary mask is auto-constructed (0 for `NA`, 1 elsewhere); otherwise no weighting is applied.
  - `nstart`: number of random restarts. Signed models have more local minima than non-negative ones because \(\Theta = C_{+} - C_{-}\) can take both positive and negative values. Since `nmfkc.signed()` itself does not loop over restarts (callers control it), set the outer-loop size by, e.g., running the function several times with different `seed` values and keeping the fit with the smallest `$objfunc`. A restart budget of 10-50 is recommended for publication-grade runs on signed data.
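The default `Y.weights` behavior for missing data amounts to a masked least-squares objective. The following base-R sketch is illustrative only (a plausible stand-in for the internal logic, not the package's actual code):

```r
# Auto-constructed binary mask: 0 where Y is NA, 1 elsewhere,
# so missing / held-out cells contribute nothing to the weighted loss.
set.seed(1)
Y <- matrix(stats::rnorm(12), 3, 4)
Y[2, 3] <- NA                     # one missing response cell
W  <- ifelse(is.na(Y), 0, 1)      # binary mask, same shape as Y
Y0 <- ifelse(is.na(Y), 0, Y)      # zero-fill so NA cells drop out cleanly
Yhat <- matrix(0, 3, 4)           # placeholder fitted values
loss <- sum(W * (Y0 - Yhat)^2)    # weighted objective, linear in W
```

The same mechanism supports cross-validation: `nmfkc.signed.cv`/`nmfkc.signed.ecv` simply pass a 0/1 mask over the test elements instead of setting them to `NA`.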
Value
An object of class `c("nmfkc.signed", "nmfkc")` with
- `X`: \(Q_{\mathrm{obs}} \times Q\) basis matrix (non-negative, column-normalized according to `X.restriction`).
- `Cp`, `Cn`: \(Q \times D\) non-negative parts of \(\Theta\), so that \(\Theta = C_{+} - C_{-}\).
- `C`: \(C_{+} - C_{-}\) (= \(\Theta\)), signed.
- `B`: \(C \, A\), \(Q \times N\) (signed).
- `objfunc.iter`: objective values per iteration.
- `objfunc`: final objective.
- `r.squared`: \(\mathrm{cor}(Y, \widehat Y)^2\).
- `mae`: mean absolute error.
- `iter`: number of iterations performed.
- `runtime`: elapsed seconds.
- `Y.signed`: logical; whether \(Y\) contained negative entries during fitting.
- `pars`: RFF generating parameters, if supplied.
- `call`: the matched call.
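The returned components satisfy simple identities (`C = Cp - Cn`, `B = C %*% A`, fitted values `X %*% B`) that can be checked by hand. The sketch below uses random stand-in matrices, not an actual fit, and computes `r.squared` and `mae` exactly as defined above:

```r
# Reconstruct fitted values and fit statistics from the listed components.
set.seed(1)
Qobs <- 8; Q <- 3; D <- 5; N <- 40
X  <- matrix(stats::runif(Qobs * Q), Qobs, Q)  # non-negative basis
Cp <- matrix(stats::runif(Q * D), Q, D)        # positive part of Theta
Cn <- matrix(stats::runif(Q * D), Q, D)        # negative part of Theta
A  <- matrix(stats::rnorm(D * N), D, N)        # signed covariates
C  <- Cp - Cn                                  # signed coefficients Theta
B  <- C %*% A                                  # Q x N scores (signed)
Yhat <- X %*% B                                # fitted values X Theta A
Y  <- Yhat + 0.1 * matrix(stats::rnorm(Qobs * N), Qobs, N)  # noisy "data"
r.squared <- cor(as.vector(Y), as.vector(Yhat))^2
mae <- mean(abs(Y - Yhat))
```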
References
Ding, C. H. Q., Li, T., & Jordan, M. I. (2010). Convex and semi-nonnegative matrix factorizations. IEEE TPAMI, 32(1), 45–55.
Rahimi, A., & Recht, B. (2007). Random features for large-scale kernel machines. Advances in NIPS, 20.
Examples
# \donttest{
set.seed(1)
## Example 1: signed A (e.g., hand-built RFF features), non-negative Y
## Build simple signed features Z = sqrt(2/D) * cos(omega^T U + b):
U <- matrix(stats::rnorm(5 * 40), 5, 40) # raw input
D <- 20 # feature dim
omega <- matrix(stats::rnorm(5 * D), 5, D) # random freqs
b <- stats::runif(D, 0, 2 * pi) # phase
Z <- sqrt(2 / D) *
  cos(t(omega) %*% U + matrix(b, D, 40)) # D x 40, signed
Y <- matrix(abs(stats::rnorm(8 * 40)), 8, 40)
res1 <- nmfkc.signed(Y, A = Z, rank = 3, maxit = 200)
#> Y(8,40) ~ X(8,3) %*% C(3,20) %*% A(20,40) [signed covariate]
## Example 2: signed Y (regression)
Y2 <- matrix(stats::rnorm(8 * 40), 8, 40) # signed response
res2 <- nmfkc.signed(Y2, A = Z, rank = 3, maxit = 200,
                     warm.start = FALSE)
#> Y(8,40) ~ X(8,3) %*% C(3,20) %*% A(20,40) [signed covariate], Y signed
# }