Changelog
Source: NEWS.md
nmfkc 0.7.2
Headline: NMF-FFB rebrand and full bootstrap inference
- nmf.ffb* family added as the canonical alias for nmf.sem* (Satoh 2025, arXiv:2512.18250 adopts “NMF-FFB”, Non-negative Matrix Factorization with Feed-Forward + Feedback, as the model’s canonical name). nmf.sem* continues to work and shares the same return classes (c("nmf.ffb", "nmf.sem") and c("nmf.ffb.inference", "nmf.sem.inference", ...)), so existing scripts are unaffected.
- nmf.sem.inference() / nmf.ffb.inference(): replaced the legacy 1-step Newton wild bootstrap with a full X-fixed pair bootstrap. Resamples columns of (Y1, Y2), refits (C1, C2) with X held at the original fit, and reports per-element support_rate = mean(|c_b| > threshold) together with percentile CIs. Significance markers (* / ** / *** at sup > 0.95 / 0.99 / 0.999) follow the lavaan convention. Both Theta_1 (feedback) and Theta_2 (exogenous) are inference targets (the previous version covered only Theta_2).
- nmf.sem() / nmf.ffb(): now runs nmfkc(Y1, A = Y2) internally by default when X.init is a string method, forwarding X.init, X.L2.ortho, epsilon, maxit, seed. The feedforward fit is used both as the X warm-start and as the baseline for SC.map. nmfkc.baseline = FALSE opts out.
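The X-fixed pair bootstrap described above can be illustrated with a small base-R sketch. This is not the package's implementation: it uses a single coefficient matrix instead of (C1, C2), and an unconstrained least-squares refit stands in for the multiplicative-update refit. The names `refit_C`, `threshold`, and the toy dimensions are assumptions made for illustration.

```r
# Toy X-fixed pair bootstrap (illustrative only, not nmfkc code).
set.seed(1)
P <- 4; Q <- 2; R <- 2; N <- 60
X <- matrix(runif(P * Q), P, Q)             # basis, held fixed at the "original fit"
C_true <- matrix(c(1, 0, 0.5, 0.8), Q, R)
A <- matrix(runif(R * N), R, N)             # covariates; columns are observations
Y <- X %*% C_true %*% A + matrix(rnorm(P * N, sd = 0.05), P, N)

# Least-squares C for Y ~ X C A with X fixed: (X'X)^{-1} X' Y A' (A A')^{-1}.
# Assumption: this replaces the non-negative refit used by the package.
refit_C <- function(Y, A, X) {
  solve(crossprod(X), crossprod(X, Y)) %*% t(A) %*% solve(tcrossprod(A))
}

B <- 200
threshold <- 0.1
boot <- replicate(B, {
  idx <- sample(N, replace = TRUE)          # resample columns of (Y, A) jointly
  refit_C(Y[, idx], A[, idx], X)
})                                          # Q x R x B array of replicates

# Per-element support rate, percentile CIs, and significance markers.
support <- apply(abs(boot) > threshold, c(1, 2), mean)
ci_low  <- apply(boot, c(1, 2), quantile, probs = 0.025)
ci_high <- apply(boot, c(1, 2), quantile, probs = 0.975)
sig <- ifelse(support > 0.999, "***",
       ifelse(support > 0.99, "**",
       ifelse(support > 0.95, "*", "")))
```

Resampling whole columns keeps each response paired with its own covariates, which is what distinguishes a pair bootstrap from a residual-based scheme.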
Bug Fixes
- nmf.sem.inference(): fixed dimension bug in the Leontief identity matrix (I_mat <- diag(Q) should have been diag(P1)); previously every replicate was silently marked invalid when P1 != Q.
- nmfkc.net(): now auto-masks NA entries of Y (parity with the other four NMF variants); previously errored at the min(Y) < 0 check when Y contained NA.
- nmfkc(): Fixed C matrix asymmetry in tri-symmetric NMF (Y.symmetric = "tri"). The C update was using stale B and XB computed from the old X; now B and XB are recomputed after X is updated. Also fixed column reordering to permute both rows and columns of C. Previously the relative asymmetry could reach ~46%; now it is at machine precision (~1e-14).
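The identity-matrix dimension bug can be reproduced in a few lines. The sketch below assumes the equilibrium mapping has the Leontief-type form (I - X C1)^{-1} X C2, with X of size P1 x Q and C1 of size Q x P1; that form and all variable names are assumptions for illustration, not code taken from the package.

```r
# Illustrative dimensions with P1 != Q, the case that triggered the bug.
set.seed(1)
P1 <- 5; Q <- 2; P2 <- 3
X  <- matrix(runif(P1 * Q), P1, Q)          # basis (P1 x Q)
C1 <- matrix(runif(Q * P1, 0, 0.1), Q, P1)  # feedback coefficients (Q x P1)
C2 <- matrix(runif(Q * P2), Q, P2)          # exogenous coefficients (Q x P2)

XC1 <- X %*% C1                             # P1 x P1 feedback operator

# Wrong: a Q x Q identity is non-conformable with the P1 x P1 product,
# so the computation fails and the replicate would be discarded.
wrong <- try(solve(diag(Q) - XC1) %*% X %*% C2, silent = TRUE)

# Right: the identity must match the P1 x P1 feedback operator.
M <- solve(diag(P1) - XC1) %*% X %*% C2     # assumed equilibrium mapping (P1 x P2)
```

When the error is caught inside a bootstrap loop, every replicate fails in the same way, which matches the "silently marked invalid" symptom in the entry above.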
Improvements
- Y.weights semantics unified to lm()-style weighted least squares across nmfkc(), nmfae(), nmfkc.net(), nmfkc.signed(), nmfae.signed(): loss is now sum(W * (Y - Yhat)^2) (linear in W, matching lm()’s weights argument). Binary masks (W ∈ {0, 1}; the standard ECV / NA-mask case) are unaffected since W = W^2.
- All MU functions now emit a "maximum iterations (N) reached..." warning when maxit is exhausted without meeting the relative-tolerance criterion (previously silent in nmfae, nmfae.signed, nmfkc.net, nmfkc.signed, nmfre, and nmf.sem).
- All MU functions now share maxit = 5000 as the default (was 5000 / 20000 / 50000 inconsistently). Together with the maxit warning above, users see explicit feedback when 5000 is insufficient and can opt into a larger cap.
- New shared internal helper .init_X_method() for X initialization via "nndsvd" / "kmeans" / "kmeansar" / "runif" / numeric matrix. All NMF families now use the same dispatch logic; previous ad-hoc inline implementations are removed.
- nmf.sem() returns SC.map (input-output structural fidelity: correlation between the equilibrium operator and the feedforward baseline mapping; Satoh 2025 §4.SC.map) automatically when nmfkc.baseline is supplied or computed internally.
- summary.nmf.sem(): rewritten to display the full-bootstrap inference output: separate Theta_1 / Theta_2 blocks with Estimate | CI_low | CI_high | support | Pr(>0) | sig, plus a bootstrap meta-info header.
- coef.nmf.sem(): now returns a long-format data frame with rows for every entry of both C1 and C2 (Type | Basis | Covariate | Estimate); previously returned only the C2 matrix when no inference had been run. Schema matches the inference-augmented output for uniformity.
- plot.nmf.sem(): default trace is now objfunc.full (loss + penalties, the monotonically decreasing quantity that the multiplicative updates actually minimize) instead of objfunc (reconstruction only). New argument which = "full" | "reconstruction" | "both".
- nmf.sem.DOT(): significance stars now appear on Theta_1 (feedback Y1 → F) edges in addition to Theta_2 (exogenous Y2 → F); X (F → Y1) edges remain unstarred since the basis is not the inference target.
- plot.nmfae.ecv(): Heatmap cell text color is now always black for better readability on light-colored cells.
- nmfkc(): X.init = "runif" now supports nstart > 1 for multi-start initialization. Multiple random starting points are evaluated with 10 standard NMF iterations, and the best (lowest Frobenius error) is selected.
- nmfae(), nmfre(): r.squared is now computed as cor(Y, fitted)^2 (squared correlation between observed and fitted values), consistent with nmfkc(). Previously nmfae() used 1 - SS_res/SS_tot and nmfre() used the same regression-style R-squared, which can behave unexpectedly for intercept-free non-negative models.
- nmfkc.kernel.beta.nearest.med(): added a candidates argument controlling the bandwidth grid. Options: "7points" (new default, t = {-1,-2/3,-1/3,0,1/3,2/3,1}), "4points" (t = {-1/2, 0, 1/2, 1}), or a user-supplied numeric vector of values. Previously the grid silently differed between the no-landmark (Uk = NULL; 4 points) and landmark (7 points) branches.
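Two of the entries above, the lm()-style weighting and the change of r.squared definition, are easy to check numerically. The following base-R sketch uses toy matrices; `wls_loss` and the other names are hypothetical helpers, not package functions.

```r
set.seed(1)
Y    <- matrix(runif(12), 3, 4)
Yhat <- Y + matrix(rnorm(12, sd = 0.1), 3, 4)

# lm()-style weighting: the loss is linear in W.
wls_loss <- function(Y, Yhat, W) sum(W * (Y - Yhat)^2)

# For a binary mask, W equals W^2, so squaring the weights changes nothing.
W_mask <- matrix(rbinom(12, 1, 0.7), 3, 4)
same_for_mask <- isTRUE(all.equal(wls_loss(Y, Yhat, W_mask),
                                  wls_loss(Y, Yhat, W_mask^2)))

# For non-binary weights, linear and squared weighting give different losses.
W_real <- matrix(runif(12, 0.5, 2), 3, 4)
differs_for_real <- !isTRUE(all.equal(wls_loss(Y, Yhat, W_real),
                                      wls_loss(Y, Yhat, W_real^2)))

# The two R-squared definitions mentioned in the entry on r.squared.
r2_cor <- cor(as.vector(Y), as.vector(Yhat))^2            # squared correlation
r2_reg <- 1 - sum((Y - Yhat)^2) / sum((Y - mean(Y))^2)    # regression-style
```

The squared correlation always lies in [0, 1], whereas the regression-style value can go negative when the fit is worse than the grand mean, which is one way it can "behave unexpectedly" for intercept-free models.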
New Functions (Signed NMF family)
- nmfkc.signed(): NMF-KC with signed covariate/coefficient. The basis X stays non-negative, the coefficient matrix C is signed, and the covariate matrix A is real-valued. Uses Ding et al. (2010) sign-splitting + Direct MU; Y may also contain negative entries (semi-NMF regression). Supports Y.weights for element-wise masking.
- nmfkc.signed.cv(), nmfkc.signed.ecv(): column-wise and element-wise k-fold CV for rank selection on signed data.
- nmfae.signed(): Three-layer autoencoder with a signed bottleneck. The non-negative decoder and encoder bases preserve soft clustering on both sides, while the bottleneck can carry negative weights (e.g., anti-correlated properties). Hybrid warm-start (from nmfae()) + Direct MU with multi-restart.
- nmfae.signed.ecv(): element-wise CV for (decoder-rank, encoder-rank) selection.
- nmfae.signed.inference(): sandwich SE + wild bootstrap for the signed bottleneck matrix (no non-negativity projection is applied, since it is signed).
- S3 methods predict.*.signed(), plot.*.signed(), summary.*.signed(), and nmfae.signed.rename() helper.
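The sign-splitting idea referenced above (Ding et al. 2010) writes a signed matrix as the difference of two non-negative parts. A minimal base-R sketch of that decomposition follows; it shows only the split itself, not the multiplicative updates built on it, and the variable names are illustrative.

```r
set.seed(1)
C <- matrix(rnorm(6), 2, 3)       # a signed coefficient matrix (toy example)

# Positive and negative parts: C_pos = (|C| + C) / 2, C_neg = (|C| - C) / 2.
C_pos <- pmax(C, 0)
C_neg <- pmax(-C, 0)

# Both parts are non-negative, never overlap, and recombine to C.
recombined <- C_pos - C_neg
```

Because each part is non-negative, update rules of the multiplicative type can be applied to quantities built from C_pos and C_neg even though C itself is signed.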
New Functions (Network NMF family)
- nmfkc.net(): Single unified entry point for symmetric NMF of network data, with type = "tri" | "bi" | "signed". All three variants use the Frobenius-full bilateral gradient (supersedes the one-sided approximation in nmfkc(Y.symmetric = ...)). type = "signed" supports a signed C via Ding et al. (2010) sign-splitting, preserving a non-negative X for soft clustering while allowing inter-cluster repulsion. The returned object’s fields are uniform across types; two of them hold populated matrices only for signed. C is always populated (identity for bi, non-negative for tri, signed for signed).
- nmfkc.net.ecv(): Element-wise cross-validation with upper-triangle folds (mirrored to the lower triangle to prevent symmetry leakage). Unified entry point for type = "tri" | "bi" | "signed" (calls nmfkc.net() with the matching type for each fold).
- nmfkc.net.DOT(): Graphviz DOT visualization for symmetric NMF networks. Displays basis-to-node membership edges and inter-basis interaction edges (C matrix) with significance stars. Now has a signed parameter (auto-detected from class) to render negative C entries as dashed edges.
- nmfkc.net.inference(): Statistical inference for symmetric NMF. Wrapper around nmfkc.inference() with A = t(X). Returns off-diagonal C coefficients with sandwich SE and wild bootstrap.
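The upper-triangle fold assignment mentioned for nmfkc.net.ecv() can be sketched as follows. This is a guess at the general idea, not the package's code: whether the diagonal takes part in the folds, and the helper name `make_sym_folds`, are assumptions.

```r
# Assign CV folds on the upper triangle and mirror them, so that Y[i, j] and
# Y[j, i] are always held out together (hypothetical helper, not nmfkc API).
make_sym_folds <- function(n, nfolds, seed = 1) {
  set.seed(seed)
  fold <- matrix(NA_integer_, n, n)
  ut <- upper.tri(fold, diag = TRUE)        # assumption: diagonal is included
  fold[ut] <- sample(rep_len(seq_len(nfolds), sum(ut)))
  lt <- lower.tri(fold)
  fold[lt] <- t(fold)[lt]                   # mirror to the lower triangle
  fold
}

fold <- make_sym_folds(n = 6, nfolds = 5)

# Training weights for fold 1: held-out cells get weight 0, symmetrically.
W_train <- (fold != 1) * 1
```

If folds were drawn independently for every cell, the held-out value Y[i, j] could still be visible in the training set as Y[j, i]; mirroring removes that leak.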
Deprecations
- nmfkc(Y, Y.symmetric = "bi"|"tri"): Deprecated in favor of nmfkc.net(Y, type = "bi"|"tri"). The old implementation uses a one-sided gradient approximation that converges empirically in the non-negative case but is theoretically incorrect and does not extend to a signed C. The deprecated branch still works in v0.6.8 (with a deprecation warning) and will be removed in a future release.
Parameter Renames (old names remain usable for backward compatibility)
- nmf.sem.DOT(): weight_scale_y2f → weight_scale_c2, weight_scale_fy1 → weight_scale_x1 (matrix-name-based naming, consistent with nmfae.DOT() and nmfkc.DOT()).
- nmf.sem.DOT(): sig.level moved to after threshold for consistency with other .DOT functions.
Documentation
- README, vignettes, and roxygen @title / @description updated to use NMF-FFB as the canonical model name (with “(formerly NMF-SEM)” attached on first mention for discoverability of the legacy term). File names (R/nmf.sem.R, vignettes/nmf-sem-with-nmfkc.Rmd, man/nmf.sem.Rd), function names (nmf.sem*), and S3 classes ("nmf.sem") are unchanged so URLs and existing scripts continue to work.
nmfkc 0.6.7
CRAN release: 2026-04-15
Bug Fixes
- Added fitted.nmfae() and residuals.nmfae() S3 methods; previously fitted() on an nmfae object silently returned NULL because the wrong field name ($XB instead of $Y1hat) was used.
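The silent NULL comes from how `$` behaves on lists: a missing field yields NULL rather than an error. A toy S3 sketch, using a made-up class "toyae" rather than the real nmfae object layout (only the field name Y1hat is taken from the entry above; Y1 is assumed):

```r
# Toy object; class name and the Y1 field are illustrative assumptions.
obj <- structure(
  list(Y1 = matrix(1:6, 2, 3),
       Y1hat = matrix(c(1, 2, 3, 4, 5, 7), 2, 3)),
  class = "toyae"
)

# Looking up a field that does not exist returns NULL without any warning.
wrong_field <- obj$XB

# S3 methods that read the correct field.
fitted.toyae    <- function(object, ...) object$Y1hat
residuals.toyae <- function(object, ...) object$Y1 - object$Y1hat

fit <- fitted(obj)
res <- residuals(obj)
```

A defensive variant could call stop() when the field is NULL, which turns this class of bug into an immediate error.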
Naming Unification (old names remain usable for backward compatibility)
- Coefficient tables: all inference functions now use Basis / Covariate columns (was Factor / Exogenous in nmf.sem.inference(), Decoder / Encoder in nmfae.inference()).
- Wild bootstrap defaults unified: wild.B = 500, wild.seed = 123 across all inference functions.
- First argument of all .DOT functions renamed to result for consistency.
- CV tuning parameters (nfolds, seed, shuffle) moved to ... in nmfkc.ecv(), nmfae.ecv(), nmfae.cv(), nmf.sem.cv(); div also accepted for backward compatibility.
nmfkc 0.6.6
New Functions
- nmfkc.criterion(): Extracted criterion computation from nmfkc() as a standalone exported function. Supports detail = "full" / "fast" / "minimal" to control computation cost.
- nmfre.inference(): Separated statistical inference from nmfre() optimization. Returns coefficient table with SE, z-values, and p-values via wild bootstrap.
- nmf.sem.inference(): Statistical inference for the C2 parameter matrix in NMF-SEM. Uses sandwich SE and wild bootstrap.
- S3 methods coef(), fitted(), residuals() for all model classes (nmfkc, nmfae, nmfre, nmf.sem).
- S3 methods plot() for nmfre and nmf.sem (convergence diagnostics).
- summary.nmf.sem(): Stability diagnostics, fit statistics, and C2 coefficient table.
Parameter Renames (old names remain usable for backward compatibility)
- nmfkc(), nmfkc.rank(): save.time / save.memory → detail
- nmfae(): Q → rank, R → rank.encoder
- nmfre(): Q → rank, dfU.cap.rate → df.rate
- nmfre.dfU.scan(), nmfkc.ar.degree.cv(): Q → rank
- nmfkc.residual.plot(): Y_XB_palette → fitted.palette, E_palette → residual.palette
- nmfkc.kernel.beta.nearest.med(): block_size → block.size, sample_size → sample.size
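One common way to keep an old argument name usable after a rename is to catch it in `...` and map it onto the new name. The sketch below shows that pattern for Q → rank; `fit_sketch` is a hypothetical function, and the package may implement its backward compatibility differently.

```r
# Hypothetical wrapper: `fit_sketch` is not part of nmfkc.
fit_sketch <- function(Y, rank = 2, ...) {
  extra <- list(...)
  if (!is.null(extra[["Q"]])) {
    rank <- extra[["Q"]]            # legacy name mapped onto the new argument
  }
  list(rank = rank)
}

new_call <- fit_sketch(NULL, rank = 3)   # current argument name
old_call <- fit_sketch(NULL, Q = 3)      # legacy argument name still works
default  <- fit_sketch(NULL)             # neither supplied: default rank
```

Using `[[` instead of `$` avoids partial matching on the list of extra arguments.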
Other Improvements
- hide.isolated option added to all .DOT functions (default TRUE).
- nmf.sem.DOT(): Added sig.level parameter; C2 edges decorated with significance stars.
- nmfkc(): Added X.restriction = "none" option and X.init = "kmeansar" initialization.
- Added arXiv/DOI references to roxygen documentation for all main functions.
- @section Lifecycle: Experimental added to nmfae().
- Removed mc.cores parallel option from nmfae.ecv() for CRAN compliance.
nmfkc 0.6.0
Bug Fixes
- Fixed variable T shadowing TRUE in information criterion computation.
- Fixed nmfkc.ecv() to use KL divergence for evaluation when method="KL".
- Added performance flags (save.time=TRUE) to nmfkc.ecv() inner calls.
- Fixed zero-division in nmfkc.rank() elbow normalization when R-squared values are identical.
- Fixed parameter name mismatch (rank → Q) in nmfkc.rank() call to nmfkc.ecv().
- Fixed descending loop in nmf.sem.split() when P=2.
- Added input validation for n.exogenous in nmf.sem.split().
Documentation
- Added roxygen documentation for summary.nmfkc() and print.summary.nmfkc().
- Added @return for plot.nmfkc() and predict.nmfkc().
- Added missing @return items (method, n.missing, n.total, rank, mae) to nmfkc().
Code Quality
- Replaced T / F with TRUE / FALSE.
- Replaced 1:length() with seq_along().
- Changed default font from Meiryo to Arial in DOT functions.
- Aligned nmf.sem.cv() defaults with nmf.sem().
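Two of the fixes in this release (T shadowing TRUE, and 1:length() versus seq_along()) guard against well-known R pitfalls. A short base-R illustration, with variable names chosen only for this example:

```r
# 1:length(x) counts down when x is empty, so a loop body would run twice.
x <- numeric(0)
bad_idx  <- 1:length(x)     # the sequence 1, 0
good_idx <- seq_along(x)    # empty: a loop over this is skipped

# T is an ordinary variable that merely defaults to TRUE, so it can be shadowed.
T <- 10                     # e.g. a variable holding a number of time points
shadowed <- isTRUE(T)       # T no longer means TRUE here
safe     <- isTRUE(TRUE)    # TRUE is a reserved word and cannot be reassigned
rm(T)                       # restore the usual meaning of T
```

The "descending loop" fix listed under Bug Fixes for nmf.sem.split() is the same family of problem as the 1:length() case.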
nmfkc 0.5.8
Graphviz DOT Output Consolidation and Cleanup
- Harmonized all DOT-generating functions (nmf.sem.DOT, nmfkc.DOT, nmfkc.ar.DOT) for consistent structure, naming conventions, and visualization logic.
- Standardized node and edge formatting rules, including unified cluster behavior, color schemes, and edge-scaling conventions.
- Implemented threshold-aware coefficient labeling so that displayed numerical precision aligns with the visualization threshold, preventing misleadingly detailed labels.
- Removed unused or redundant DOT fragments and improved compatibility across Graphviz engines.
- Enhanced layout readability through consistent indentation, node grouping, and suppression of isolated nodes in specific visualization modes (e.g., type = "YA" in nmfkc.DOT).
- Refactored and expanded internal DOT helper functions (.nmfkc_dot_format_coef, .nmfkc_dot_digits_from_threshold, .nmfkc_dot_cluster_nodes, etc.) for better maintainability and uniform behavior.
- New Function: Implemented nmfkc.ecv() for Element-wise Cross-Validation (Wold’s CV).
  - This function randomly masks elements of the observation matrix to evaluate structural reconstruction error.
  - It provides a statistically robust criterion for rank selection, avoiding the monotonic error decrease often seen in standard column-wise CV.
  - Supports vector input for rank to evaluate multiple ranks simultaneously.
- Missing Value & Weight Support:
  - nmfkc() and nmfkc.cv() now fully support missing values (NA) and observation weights via the hidden argument Y.weights (passed through ...).
  - If Y contains NAs, they are automatically detected and masked (assigned a weight of 0) during optimization.
- Rank Selection Diagnostics (nmfkc.rank):
  - Dual-Axis Visualization: The plot now displays fitting metrics (\(R^2\), etc.) on the left axis and ECV Sigma (RMSE) on the right axis (blue line).
  - Automatic Best Rank labeling: The plot explicitly marks the “Best” rank based on two criteria:
    - Elbow: Geometric elbow point of the \(R^2\) curve.
    - Min: Minimum error point of the Element-wise CV.
  - save.time defaults to FALSE, enabling the robust Element-wise CV calculation by default.
- Argument Standardization:
  - Unified the rank argument name to rank across all functions (nmfkc, nmfkc.cv, nmfkc.ecv, nmfkc.rank).
  - The legacy argument Q is still supported for backward compatibility but internally mapped to rank.
- Summary Improvements:
- Other Improvements:
  - Added a validation check in nmfkc.ar() to ensure the input Y has no missing values (as they cannot be propagated to the covariate matrix A in VAR models).
  - Refined nmfkc.residual.plot() layout margins for better visibility of titles.
  - Updated documentation to reflect all changes.
- Regularization Update: The regularization scheme has been revised from L2 (ridge) to L1 (lasso-type) penalties.
  - gamma now controls the L1 penalty on the coefficient matrix (B = C A), promoting sparsity in sample-wise coefficients.
  - A new argument lambda has been added to control the L1 penalty on the parameter matrix (C), encouraging sparsity in the shared template structure.
  Both parameters can be passed through the ellipsis (...) to nmfkc() and related functions.
- Function Signature Simplification: Many less-frequently used arguments in nmfkc() (e.g., gamma, X.restriction, X.init) and in nmfkc.cv() (e.g., div, seed) have been moved into the ellipsis (...) for a cleaner function signature.
- Performance Improvement: The internal function .silhouette.simple was vectorized and optimized to reduce computational cost, particularly for the calculation of a(i) and b(i).
- Removed the fast.calc option from the nmfkc() function.
- Added the X.init argument to the nmfkc() function, allowing selection between 'kmeans' and 'nndsvd' initialization methods.
- The penalty term has been changed from tr(CC') to tr(BB') = tr(CAA'C').
- Implemented the internal .z and xnorm functions.
- Added the fast.calc option to the nmfkc() function.
- Optimized internal calculations for improved performance.
- Updated citation("nmfkc") and added AIC/BIC to the output.
- Implemented the nmfkc.ar.stationarity() function.
- Modified the z() function.
- Used crossprod() for faster matrix multiplication.
- Implemented the nmfkc.ar.DOT() function.
- Added logic to sort the columns of X to form a unit matrix in special cases.
- Implemented nmfkc.kernel.beta.cv() and nmfkc.ar.degree.cv() functions.
- Set the default column names of X to Basis1, Basis2, etc.
- Added X.prob and X.cluster to the return object.
- Skipped CPCC and silhouette calculations when save.time = TRUE.
- Added a prototype for the nmfkc.ar() function.
- Added the criterion argument to the nmfkc() function to support multiple criteria.
- Updated the nmfkc.rank() function.
- Added the criterion argument to the nmfkc.rank() function.
- Implemented the save.time argument.
- Implemented the nmfkc.rank() function.
- Implemented the nstart option from the kmeans() function.
- Added an experimental implementation of the nmfkc.rank() function.
- Removed zero-variance columns and rows with a warning.
- Added source and references to the documentation.
- Renamed several components for clarity:
  - nmfkcreg to nmfkc
  - create.kernel to nmfkc.kernel
  - nmfkcreg.cv to nmfkc.cv
  - P to B.prob
  - cluster to B.cluster
  - unit to X.column
  - trace to print.trace
  - dims to print.dims
- Added the r.squared argument to the nmfkcreg.cv() function.
- In nmfkcreg():
  - Added the dims argument to check matrix sizes.
  - Added the unit argument to normalize the basis matrix columns.
- Modified the create.kernel() function to support prediction.
- Updated examples on GitHub.
- Removed the YHAT return value; use XB instead.
- Added the cluster return value for hard clustering.
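The element-wise cross-validation and NA masking described in the nmfkc.ecv() and Y.weights entries above both come down to building a 0/1 weight matrix. A minimal base-R sketch of how such masks could be formed is given below; the fold count, variable names, and the way the masks are combined are assumptions for illustration, not the package's internals.

```r
set.seed(1)
Y <- matrix(runif(20), 4, 5)
Y[2, 3] <- NA                               # one missing observation

# NA masking: missing cells get weight 0, observed cells weight 1.
W_na <- ifelse(is.na(Y), 0, 1)

# Element-wise folds: each observed cell is assigned to one of k folds at random.
k <- 4
obs <- which(!is.na(Y))
fold <- matrix(NA_integer_, nrow(Y), ncol(Y))
fold[obs] <- sample(rep_len(seq_len(k), length(obs)))

# Training weights for fold 1: hold out its cells and keep the NA cell masked.
W_train <- W_na * (is.na(fold) | fold != 1)

# Cells on which the held-out reconstruction error would be evaluated.
test_cells <- which(fold == 1)
```

Because whole columns are never removed, every sample still contributes to the fit in each fold, which is what lets the held-out error rise again once the rank is too large.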