Optimize NMF with kernel covariates (Full Support for Missing Values)

nmfkc fits a nonnegative matrix factorization with kernel covariates under the tri-factorization model \(Y \approx X C A = X B\).
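The shapes involved in the model can be sketched in base R without calling nmfkc. The dimensions below (P = 3, N = 4, Q = 2, R = 2), the assumption that \(C\) is Q x R and \(A\) is R x N, and all matrix values are illustrative choices, not output from the package.

```r
# Illustrative shapes for Y ~ X C A = X B (all values are made up).
X <- cbind(c(0.5, 0.0, 0.5), c(0.0, 1.0, 0.0))  # P x Q basis matrix
C <- rbind(c(2, 0), c(0, 1))                    # Q x R parameter matrix (assumed shape)
A <- rbind(c(1, 0, 1, 2), c(0, 1, 1, 0))        # R x N covariate matrix (assumed shape)

B <- C %*% A          # Q x N coefficient matrix
Y_hat <- X %*% B      # P x N fitted values

dim(B)      # 2 4
dim(Y_hat)  # 3 4
```

Grouping the product as (X C) A or as X (C A) gives the same fitted matrix, which is why the model can be written either as \(X C A\) or as \(X B\).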

This function supports two major input modes:

Matrix Mode (Existing): nmfkc(Y=matrix, A=matrix, ...)
Formula Mode (New): nmfkc(Y_vars ~ A_vars, data=df, rank=Q, ...)

The rank of the basis matrix can be specified with either the rank argument (preferred, especially in formula mode) or the Q argument, which is accepted through ... for backward compatibility.

Usage

nmfkc(Y, A = NULL, rank = NULL, data, epsilon = 1e-04, maxit = 5000, ...)

Source

Satoh, K. (2024). Applying Non-negative Matrix Factorization with Covariates to the Longitudinal Data as Growth Curve Model. arXiv:2403.05359. https://arxiv.org/abs/2403.05359

Arguments

Y

Observation matrix, or a formula object if data is supplied.

A

Covariate matrix. Default is NULL (no covariates).

rank

Integer. The rank Q of the basis matrix \(X\), i.e. its number of columns. Preferred over the backward-compatible Q argument.

data

Optional. A data frame from which variables in the formula should be taken.

epsilon

Positive convergence tolerance.

maxit

Maximum number of iterations.

...

Additional arguments passed for fine-tuning regularization, initialization, constraints, and output control. This includes the backward-compatible arguments Q and method.

Y.weights: Optional numeric matrix (P x N) or vector (length N). 0 indicates missing/ignored values. If NULL (default), weights are automatically set to 0 for NAs in Y, and 1 otherwise.
X.L2.ortho: Nonnegative penalty parameter for the orthogonality of \(X\) (default: 0). It minimizes the off-diagonal elements of the Gram matrix \(X^\top X\), reducing the correlation between basis vectors (conceptually minimizing \(\| X^\top X - \mathrm{diag}(X^\top X) \|_F^2\)). (Formerly lambda.ortho).
B.L1: Nonnegative penalty parameter for L1 regularization on \(B = C A\) (default: 0). Promotes sparsity in the coefficients. (Formerly gamma).
C.L1: Nonnegative penalty parameter for L1 regularization on \(C\) (default: 0). Promotes sparsity in the parameter matrix. (Formerly lambda).
Q: Backward-compatible name for the rank of the basis matrix (Q).
method: Objective function: Euclidean distance "EU" (default) or Kullback–Leibler divergence "KL".
X.restriction: Constraint for columns of \(X\). Options: "colSums" (default), "colSqSums", "totalSum", or "fixed".
X.init: Method for initializing the basis matrix \(X\). Options: "kmeans" (default), "runif", "nndsvd", or a user-specified matrix.
nstart: Number of random starts for kmeans when initializing \(X\) (default: 1).
seed: Integer seed for reproducibility (default: 123).
prefix: Prefix for column names of \(X\) and row names of \(B\) (default: "Basis").
print.trace: Logical. If TRUE, prints progress every 10 iterations (default: FALSE).
print.dims: Logical. If TRUE (default), prints matrix dimensions and elapsed time.
save.time: Logical. If TRUE (default), skips some post-computations (e.g., CPCC, silhouette) to save time.
save.memory: Logical. If TRUE, performs only essential computations (implies save.time = TRUE) to reduce memory usage (default: FALSE).
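Two of the descriptions above can be illustrated in base R. This is a minimal sketch under stated assumptions: it builds a weight matrix following the Y.weights default rule and evaluates the conceptual orthogonality expression from X.L2.ortho. How nmfkc scales or applies these quantities internally is not shown here, and the matrices are made up.

```r
# Default-style weights: 0 where Y is NA, 1 elsewhere.
Y <- matrix(c(1, NA, 3, 4, 5, NA), nrow = 2)
W <- ifelse(is.na(Y), 0, 1)
W

# Conceptual orthogonality term: squared Frobenius norm of the
# off-diagonal part of the Gram matrix t(X) %*% X.
X <- cbind(c(1, 0, 1), c(0, 1, 1))
G <- crossprod(X)               # Gram matrix
off_diag <- G - diag(diag(G))   # zero out the diagonal
penalty <- sum(off_diag^2)
penalty
```

The term is zero exactly when the columns of \(X\) are mutually orthogonal, so a larger X.L2.ortho pushes the basis vectors toward non-overlapping supports.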

Value

A list with components:

call: The matched call, as captured by match.call().
dims: A character string summarizing the matrix dimensions of the model.
runtime: A character string summarizing the computation time.
X: Basis matrix. Column normalization depends on X.restriction.
B: Coefficient matrix \(B = C A\).
XB: Fitted values for \(Y\).
C: Parameter matrix.
B.prob: Soft-clustering probabilities derived from columns of \(B\).
B.cluster: Hard-clustering labels (argmax over B.prob for each column).
X.prob: Row-wise soft-clustering probabilities derived from \(X\).
X.cluster: Hard-clustering labels (argmax over X.prob for each row).
A.attr: List of attributes of the input covariate matrix A, containing metadata like lag order and intercept status if created by nmfkc.ar or nmfkc.kernel.
objfunc: Final objective value.
objfunc.iter: Objective values by iteration.
r.squared: Coefficient of determination \(R^2\) between \(Y\) and \(X B\).
sigma: The residual standard error, representing the typical deviation of the observed values \(Y\) from the fitted values \(X B\).
criterion: A list of selection criteria, including ICp, CPCC, silhouette, AIC, and BIC.
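The relationship between B, B.prob, and B.cluster can be sketched in base R. The normalization used here (dividing each column of \(B\) by its sum) is an assumption about how the probabilities are derived, and the matrix values are made up for illustration.

```r
# Made-up coefficient matrix: Q = 2 bases (rows), N = 3 observations (columns).
B <- rbind(Basis1 = c(0, 1, 1), Basis2 = c(2, 0, 2))
colnames(B) <- c("N1", "N2", "N3")

# Soft clustering: scale each column to sum to 1 (assumed normalization).
B.prob <- sweep(B, 2, colSums(B), "/")

# Hard clustering: index of the largest probability in each column.
B.cluster <- apply(B.prob, 2, which.max)
B.cluster
```

A column of \(B\) that is entirely zero would give NaN under this normalization, so a real implementation needs to guard against that case.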

References

Ding, C., Li, T., Peng, W., & Park, H. (2006). Orthogonal Nonnegative Matrix Tri-Factorizations for Clustering. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 126–135). doi:10.1145/1150402.1150420

Potthoff, R. F., & Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika, 51, 313–326. doi:10.2307/2334137

Examples

# install.packages("remotes")
# remotes::install_github("ksatohds/nmfkc")
# Example 1. Matrix Mode (Existing)
library(nmfkc)
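# True factors: X is the 3 x 2 basis, B the 2 x 3 coefficient matrix
X <- cbind(c(1,0,1),c(0,1,0))
B <- cbind(c(1,0),c(0,1),c(1,1))
# Y is their exact product, so a rank-2 fit can reproduce it
Y <- X %*% B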
rownames(Y) <- paste0("P",1:nrow(Y))
colnames(Y) <- paste0("N",1:ncol(Y))
print(X); print(B); print(Y)
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    0    1
#> [3,]    1    0
#>      [,1] [,2] [,3]
#> [1,]    1    0    1
#> [2,]    0    1    1
#>    N1 N2 N3
#> P1  1  0  1
#> P2  0  1  1
#> P3  1  0  1
# Fit a rank-2 model; Q is the backward-compatible name for rank
res <- nmfkc(Y,Q=2,epsilon=1e-6)
#> Y(3,3)~X(3,2)B(2,3)...
#> 0sec
res$X
#>    Basis1      Basis2
#> P1      0 0.498047869
#> P2      1 0.003904261
#> P3      0 0.498047869
res$B
#>              N1           N2        N3
#> Basis1 0.000000 0.9999995176 0.9920988
#> Basis2 2.007838 0.0001206861 2.0079012

# Example 2. Formula Mode (New)
# dummy_data <- data.frame(Y1=rpois(10,5), Y2=rpois(10,10), A1=1:10, A2=rnorm(10,5))
# res_f <- nmfkc(Y1 + Y2 ~ A1 + A2, data=dummy_data, rank=2)