Skip to contents

nmfkc 0.7.2

Headline: NMF-FFB rebrand and full bootstrap inference

  • nmf.ffb* family added as the canonical alias for nmf.sem* (Satoh 2025, arXiv:2512.18250 adopts “NMF-FFB” — Non-negative Matrix Factorization with Feed-Forward + Feedback — as the model’s canonical name). nmf.sem* continues to work and shares the same return classes (c("nmf.ffb", "nmf.sem") and c("nmf.ffb.inference", "nmf.sem.inference", ...)), so existing scripts are unaffected.
  • nmf.sem.inference() / nmf.ffb.inference(): replaced the legacy 1-step Newton wild bootstrap with a full X-fixed pair bootstrap. Resamples columns of (Y1, Y2), refits (C1, C2) with X held at the original fit, and reports per-element support_rate = mean(|c_b| > threshold) together with percentile CIs. Significance markers (* / ** / *** at sup > 0.95 / 0.99 / 0.999) follow the lavaan convention. Both Theta_1 (feedback) and Theta_2 (exogenous) are inference targets (previous version covered only Theta_2).
  • nmf.sem() / nmf.ffb(): now runs nmfkc(Y1, A = Y2) internally by default when X.init is a string method, forwarding X.init, X.L2.ortho, epsilon, maxit, seed. The feedforward fit is used both as the X warm-start and as the baseline for SC.map. nmfkc.baseline = FALSE opts out.

Bug Fixes

  • nmf.sem.inference(): fixed dimension bug in the Leontief identity matrix (I_mat <- diag(Q) should have been diag(P1)); previously every replicate was silently marked invalid when P1 != Q.
  • nmfkc.net(): now auto-masks NA entries of Y (parity with the other four NMF variants); previously errored at the min(Y) < 0 check when Y contained NA.
  • nmfkc(): Fixed C matrix asymmetry in tri-symmetric NMF (Y.symmetric = "tri"). The C update was using stale B and XB computed from the old X; now B and XB are recomputed after X is updated. Also fixed column reordering to permute both rows and columns of C. Previously the relative asymmetry could reach ~46%; now it is at machine precision (~1e-14).

Improvements

  • Y.weights semantics unified to lm()-style weighted least squares across nmfkc(), nmfae(), nmfkc.net(), nmfkc.signed(), nmfae.signed(): loss is now sum(W * (Y - Yhat)^2) (linear in W, matching lm()’s weights argument). Binary masks (W ∈ {0, 1}; the standard ECV / NA-mask case) are unaffected since W = W^2.
  • All MU functions now emit a "maximum iterations (N) reached..." warning when maxit is exhausted without meeting the relative- tolerance criterion (previously silent in nmfae, nmfae.signed, nmfkc.net, nmfkc.signed, nmfre, and nmf.sem).
  • All MU functions now share maxit = 5000 as the default (was 5000 / 20000 / 50000 inconsistently). Together with the maxit warning above, users see explicit feedback when 5000 is insufficient and can opt into a larger cap.
  • New shared internal helper .init_X_method() for X initialization via "nndsvd" / "kmeans" / "kmeansar" / "runif" / numeric matrix. All NMF families now use the same dispatch logic; previous ad-hoc inline implementations are removed.
  • nmf.sem() returns SC.map (input-output structural fidelity: correlation between the equilibrium operator and the feedforward baseline mapping; Satoh 2025 §4.SC.map) automatically when nmfkc.baseline is supplied or computed internally.
  • summary.nmf.sem(): rewritten to display the full-bootstrap inference output — separate Theta_1 / Theta_2 blocks with Estimate | CI_low | CI_high | support | Pr(>0) | sig, plus a bootstrap meta-info header.
  • coef.nmf.sem(): now returns a long-format data frame with rows for every entry of both C1 and C2 (Type | Basis | Covariate | Estimate); previously returned only the C2 matrix when no inference had been run. Schema matches the inference-augmented output for uniformity.
  • plot.nmf.sem(): default trace is now objfunc.full (loss + penalties — the actual monotonically-decreasing quantity that the multiplicative updates minimize) instead of objfunc (reconstruction only). New argument which = "full" | "reconstruction" | "both".
  • nmf.sem.DOT(): significance stars now appear on Theta_1 (feedback Y1 → F) edges in addition to Theta_2 (exogenous Y2 → F); X (F → Y1) edges remain unstarred since the basis is not the inference target.
  • plot.nmfae.ecv(): Heatmap cell text color is now always black for better readability on light-colored cells.
  • nmfkc(): X.init = "runif" now supports nstart > 1 for multi-start initialization. Multiple random starting points are evaluated with 10 standard NMF iterations, and the best (lowest Frobenius error) is selected.
  • nmfae(), nmfre(): r.squared is now computed as cor(Y, fitted)^2 (squared correlation between observed and fitted values), consistent with nmfkc(). Previously nmfae() used 1 - SS_res/SS_tot and nmfre() used the same regression-style R-squared, which can behave unexpectedly for intercept-free non-negative models.
  • nmfkc.kernel.beta.nearest.med(): added a candidates argument controlling the bandwidth grid. Options: "7points" (new default, t = {-1,-2/3,-1/3,0,1/3,2/3,1}), "4points" (t = {-1/2, 0, 1/2, 1}), or a user-supplied numeric vector of values. Previously the grid silently differed between the no-landmark (Uk = NULL; 4 points) and landmark (7 points) branches.

New Functions (Signed NMF family)

  • nmfkc.signed(): NMF-KC with signed covariate/coefficient. Model with , (signed), real-valued. Uses Ding et al. (2010) sign-splitting + Direct MU; may also contain negative entries (semi-NMF regression). Supports Y.weights for element-wise masking.
  • nmfkc.signed.cv(), nmfkc.signed.ecv(): column-wise and element-wise k-fold CV for rank selection on signed data.
  • nmfae.signed(): Three-layer autoencoder with . preserve soft clustering on both decoder and encoder sides while the bottleneck can carry negative weights (e.g., anti-correlated properties). Hybrid warm-start (from nmfae()) + Direct MU with multi-restart.
  • nmfae.signed.ecv(): element-wise CV for (decoder-rank, encoder-rank) selection.
  • nmfae.signed.inference(): sandwich SE + wild bootstrap for (no non-negativity projection on since it is signed).
  • S3 methods predict.*.signed(), plot.*.signed(), summary.*.signed(), and nmfae.signed.rename() helper.

New Functions (Network NMF family)

  • nmfkc.net(): Single unified entry point for symmetric NMF of network data, with type = "tri" | "bi" | "signed". All three variants use the Frobenius-full bilateral gradient (supersedes the one-sided approximation in nmfkc(Y.symmetric = ...)). type = "signed" supports signed via Ding et al. (2010) sign-splitting, preserving for soft clustering while allowing inter-cluster repulsion. The returned object’s fields are uniform across types: and are for tri/bi, and populated matrices for signed. is always populated (identity for bi, non-negative for tri, signed for signed).
  • nmfkc.net.ecv(): Element-wise cross-validation with upper-triangle folds (mirrored to the lower triangle to prevent symmetry leakage). Unified entry point for type = "tri" | "bi" | "signed" (calls nmfkc.net() with the matching type for each fold).
  • nmfkc.net.DOT(): Graphviz DOT visualization for symmetric NMF networks. Displays basis-to-node membership edges and inter-basis interaction edges (C matrix) with significance stars. Now has signed parameter (auto-detected from class) to render negative C entries as dashed edges.
  • nmfkc.net.inference(): Statistical inference for symmetric NMF. Wrapper around nmfkc.inference() with A = t(X). Returns off-diagonal C coefficients with sandwich SE and wild bootstrap.

Deprecations

  • nmfkc(Y, Y.symmetric = "bi"|"tri"): Deprecated in favor of nmfkc.net(Y, type = "bi"|"tri"). The old implementation uses a one-sided gradient approximation that empirically converges for but is theoretically incorrect and does not extend to signed . The deprecated branch still works in v0.6.8 (with a deprecation warning) and will be removed in a future release.

Parameter Renames (old names remain usable for backward compatibility)

  • nmf.sem.DOT(): weight_scale_y2fweight_scale_c2, weight_scale_fy1weight_scale_x1 (matrix-name-based naming, consistent with nmfae.DOT() and nmfkc.DOT()).
  • nmf.sem.DOT(): sig.level moved to after threshold for consistency with other .DOT functions.

Documentation

  • README, vignettes, and roxygen @title / @description updated to use NMF-FFB as the canonical model name (with “(formerly NMF-SEM)” attached on first mention for discoverability of the legacy term). File names (R/nmf.sem.R, vignettes/nmf-sem-with- nmfkc.Rmd, man/nmf.sem.Rd), function names (nmf.sem*), and S3 classes ("nmf.sem") are unchanged so URLs and existing scripts continue to work.

nmfkc 0.6.7

CRAN release: 2026-04-15

Bug Fixes

Naming Unification (old names remain usable for backward compatibility)

  • Coefficient tables: all inference functions now use Basis / Covariate columns (was Factor/Exogenous in nmf.sem.inference(), Decoder/Encoder in nmfae.inference()).
  • Wild bootstrap defaults unified: wild.B = 500, wild.seed = 123 across all inference functions.
  • First argument of all .DOT functions renamed to result for consistency.
  • CV tuning parameters (nfolds, seed, shuffle) moved to ... in nmfkc.ecv(), nmfae.ecv(), nmfae.cv(), nmf.sem.cv(); div also accepted for backward compatibility.

nmfkc 0.6.6

New Functions

  • nmfkc.criterion(): Extracted criterion computation from nmfkc() as a standalone exported function. Supports detail = "full" / "fast" / "minimal" to control computation cost.
  • nmfre.inference(): Separated statistical inference from nmfre() optimization. Returns coefficient table with SE, z-values, and p-values via wild bootstrap.
  • nmf.sem.inference(): Statistical inference for the C2 parameter matrix in NMF-SEM. Uses sandwich SE and wild bootstrap.
  • S3 methods coef(), fitted(), residuals() for all model classes (nmfkc, nmfae, nmfre, nmf.sem).
  • S3 methods plot() for nmfre and nmf.sem (convergence diagnostics).
  • summary.nmf.sem(): Stability diagnostics, fit statistics, and C2 coefficient table.

Parameter Renames (old names remain usable for backward compatibility)

Other Improvements

  • hide.isolated option added to all .DOT functions (default TRUE).
  • nmf.sem.DOT(): Added sig.level parameter; C2 edges decorated with significance stars.
  • nmfkc(): Added X.restriction = "none" option and X.init = "kmeansar" initialization.
  • Added arXiv/DOI references to roxygen documentation for all main functions.
  • @section Lifecycle: Experimental added to nmfae().
  • Removed mc.cores parallel option from nmfae.ecv() for CRAN compliance.

nmfkc 0.6.0

Bug Fixes

  • Fixed variable T shadowing TRUE in information criterion computation.
  • Fixed nmfkc.ecv() to use KL divergence for evaluation when method="KL".
  • Added performance flags (save.time=TRUE) to nmfkc.ecv() inner calls.
  • Fixed zero-division in nmfkc.rank() elbow normalization when R-squared values are identical.
  • Fixed parameter name mismatch (rankQ) in nmfkc.rank() call to nmfkc.ecv().
  • Fixed descending loop in nmf.sem.split() when P=2.
  • Added input validation for n.exogenous in nmf.sem.split().

Documentation

Code Quality

  • Replaced T/F with TRUE/FALSE.
  • Replaced 1:length() with seq_along().
  • Changed default font from Meiryo to Arial in DOT functions.
  • Aligned nmf.sem.cv() defaults with nmf.sem().

nmfkc 0.5.8

Graphviz DOT Output Consolidation and Cleanup

  • Harmonized all DOT-generating functions (nmf.sem.DOT, nmfkc.DOT, nmfkc.ar.DOT) for consistent structure, naming conventions, and visualization logic.

  • Standardized node and edge formatting rules, including unified cluster behavior, color schemes, and edge-scaling conventions.

  • Implemented threshold-aware coefficient labeling so that displayed numerical precision aligns with the visualization threshold, preventing misleadingly detailed labels.

  • Removed unused or redundant DOT fragments and improved compatibility across Graphviz engines.

  • Enhanced layout readability through consistent indentation, node grouping, and suppression of isolated nodes in specific visualization modes (e.g., type = "YA" in nmfkc.DOT).

  • Refactored and expanded internal DOT helper functions (.nmfkc_dot_format_coef, .nmfkc_dot_digits_from_threshold, .nmfkc_dot_cluster_nodes, etc.) for better maintainability and uniform behavior.

  • New Function: Implemented nmfkc.ecv() for Element-wise Cross-Validation (Wold’s CV).

    • This function randomly masks elements of the observation matrix to evaluate structural reconstruction error.
    • It provides a statistically robust criterion for rank selection, avoiding the monotonic error decrease often seen in standard column-wise CV.
    • Supports vector input for rank to evaluate multiple ranks simultaneously.
  • Missing Value & Weight Support:

    • nmfkc() and nmfkc.cv() now fully support missing values (NA) and observation weights via the hidden argument Y.weights (passed through ...).
    • If Y contains NAs, they are automatically detected and masked (assigned a weight of 0) during optimization.
  • Rank Selection Diagnostics (nmfkc.rank):

    • Dual-Axis Visualization: The plot now displays fitting metrics (\(R^2\), etc.) on the left axis and ECV Sigma (RMSE) on the right axis (blue line).
    • Automatic Best Rank labeling: The plot explicitly marks the “Best” rank based on two criteria:
      • Elbow: Geometric elbow point of the \(R^2\) curve.
      • Min: Minimum error point of the Element-wise CV.
    • save.time defaults to FALSE, enabling the robust Element-wise CV calculation by default.
  • Argument Standardization:

    • Unified the rank argument name to rank across all functions (nmfkc, nmfkc.cv, nmfkc.ecv, nmfkc.rank).
    • The legacy argument Q is still supported for backward compatibility but internally mapped to rank.
  • Summary Improvements:

    • Updated summary() and print() methods to report:
      • Sparsity of Basis (\(X\)) and Coefficients (\(B\)).
      • Clustering Entropy (indicating “Crisp” vs “Ambiguous” clustering).
      • Clustering Crispness (Mean Max Probability).
      • Number and percentage of missing values in \(Y\).
  • Other Improvements:

    • Added a validation check in nmfkc.ar() to ensure the input Y has no missing values (as they cannot be propagated to the covariate matrix A in VAR models).
    • Refined nmfkc.residual.plot() layout margins for better visibility of titles.
    • Updated documentation to reflect all changes.
  • Regularization Update:
    The regularization scheme has been revised from L2 (ridge) to L1 (lasso-type) penalties.

    • gamma now controls the L1 penalty on the coefficient matrix ( B = C A ), promoting sparsity in sample-wise coefficients.
    • A new argument lambda has been added to control the L1 penalty on the parameter matrix ( C ), encouraging sparsity in the shared template structure.
      Both parameters can be passed through the ellipsis (...) to nmfkc() and related functions.
  • Function Signature Simplification:** Many less-frequently used arguments in nmfkc() (e.g., gamma, X.restriction, X.init) and in nmfkc.cv() (e.g., div, seed) have been moved into the ellipsis (...) for a cleaner function signature.

  • Performance Improvement: The internal function .silhouette.simple was vectorized and optimized to reduce computational cost, particularly for the calculation of a(i) and b(i).

  • Removed the fast.calc option from the nmfkc() function.

  • Added the X.init argument to the nmfkc() function, allowing selection between 'kmeans' and 'nndsvd' initialization methods.

  • The penalty term has been changed from tr(CC') to tr(BB') = tr(CAA'C').

  • Implemented the internal .z and xnorm functions.

  • Added the fast.calc option to the nmfkc() function.

  • Optimized internal calculations for improved performance.

  • Updated citation("nmfkc") and added AIC/BIC to the output.

  • Implemented the nmfkc.ar.stationarity() function.

  • Modified the z() function.

  • Used crossprod() for faster matrix multiplication.

  • Implemented the nmfkc.ar.DOT() function.

  • Added logic to sort the columns of X to form a unit matrix in special cases.

  • Implemented nmfkc.kernel.beta.cv() and nmfkc.ar.degree.cv() functions.

  • Set the default column names of X to Basis1, Basis2, etc.

  • Added X.prob and X.cluster to the return object.

  • Skipped CPCC and silhouette calculations when save.time = TRUE.

  • Added a prototype for the nmfkc.ar() function.

  • Added the criterion argument to the nmfkc() function to support multiple criteria.

  • Updated the nmfkc.rank() function.

  • Added the criterion argument to the nmfkc.rank() function.

  • Implemented the save.time argument.

  • Implemented the nmfkc.rank() function.

  • Implemented the nstart option from the kmeans() function.

  • Added an experimental implementation of the nmfkc.rank() function.

  • Removed zero-variance columns and rows with a warning.

  • Added source and references to the documentation.

  • Renamed several components for clarity:

    • nmfkcreg to nmfkc
    • create.kernel to nmfkc.kernel
    • nmfkcreg.cv to nmfkc.cv
    • P to B.prob
    • cluster to B.cluster
    • unit to X.column
    • trace to print.trace
    • dims to print.dims
  • Added the r.squared argument to the nmfkcreg.cv() function.

  • In nmfkcreg():

    • Added the dims argument to check matrix sizes.
    • Added the unit argument to normalize the basis matrix columns.
  • Modified the create.kernel() function to support prediction.

  • Updated examples on GitHub.

  • Removed the YHAT return value; use XB instead.

  • Added the cluster return value for hard clustering.