nmfkc 0.6.6

New Functions

  • nmfkc.criterion(): Extracted criterion computation from nmfkc() as a standalone exported function. Supports detail = "full" / "fast" / "minimal" to control computation cost.
  • nmfre.inference(): Separated statistical inference from nmfre() optimization. Returns coefficient table with SE, z-values, and p-values via wild bootstrap.
  • nmf.sem.inference(): Statistical inference for the C2 parameter matrix in NMF-SEM. Uses sandwich SE and wild bootstrap.
  • S3 methods coef(), fitted(), residuals() for all model classes (nmfkc, nmfae, nmfre, nmf.sem).
  • S3 plot() methods for nmfre and nmf.sem (convergence diagnostics).
  • summary.nmf.sem(): Stability diagnostics, fit statistics, and C2 coefficient table.

Parameter Renames (old names remain usable for backward compatibility)

Other Improvements

  • hide.isolated option added to all .DOT functions (default TRUE).
  • nmf.sem.DOT(): Added sig.level parameter; C2 edges decorated with significance stars.
  • nmfkc(): Added X.restriction = "none" option and X.init = "kmeansar" initialization.
  • Added arXiv/DOI references to roxygen documentation for all main functions.
  • Added @section Lifecycle: Experimental to the nmfae() documentation.
  • Removed mc.cores parallel option from nmfae.ecv() for CRAN compliance.

nmfkc 0.6.0

Bug Fixes

  • Fixed variable T shadowing TRUE in information criterion computation.
  • Fixed nmfkc.ecv() to use KL divergence for evaluation when method="KL".
  • Added performance flags (save.time=TRUE) to nmfkc.ecv() inner calls.
  • Fixed zero-division in nmfkc.rank() elbow normalization when R-squared values are identical.
  • Fixed parameter name mismatch (rankQ) in nmfkc.rank() call to nmfkc.ecv().
  • Fixed descending loop in nmf.sem.split() when P=2.
  • Added input validation for n.exogenous in nmf.sem.split().
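Two of the fixes above concern common base R pitfalls. The sketch below reproduces them in isolation; it is not the package code, and the assumption that the descending loop came from a colon range such as 3:P is ours, not something the changelog states.

```r
# Pitfall 1: T is an ordinary variable that can be reassigned,
# whereas TRUE is a reserved word.
T <- 0                            # e.g. a counter or series length stored in T
flag_with_T    <- isTRUE(T)       # FALSE, because T now holds the number 0
flag_with_TRUE <- isTRUE(TRUE)    # TRUE, and TRUE cannot be reassigned
rm(T)                             # drop the shadowing variable again

# Pitfall 2: a:b counts DOWN when a > b, so a loop that should be
# skipped still runs (assumed form of the bug, for illustration only).
P <- 2
descending <- 3:P                        # c(3, 2)
guarded    <- seq_len(max(P - 2, 0)) + 2 # empty vector: loop body is skipped
```

Writing TRUE/FALSE in full and building index ranges with seq_len() avoids both problems without changing behaviour in the normal case.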

Documentation

Code Quality

  • Replaced T/F with TRUE/FALSE.
  • Replaced 1:length() with seq_along().
  • Changed default font from Meiryo to Arial in DOT functions.
  • Aligned nmf.sem.cv() defaults with nmf.sem().
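The reason for preferring seq_along() over 1:length() shows up only with empty inputs. A minimal base R illustration (the helper count_iterations is hypothetical and exists only for this example):

```r
x <- numeric(0)               # an empty input
idx_colon <- 1:length(x)      # 1:0 is c(1, 0): a loop over it runs twice
idx_safe  <- seq_along(x)     # integer(0): a loop over it is skipped

# Hypothetical helper: count how many times a for loop would iterate.
count_iterations <- function(idx) {
  n <- 0
  for (i in idx) n <- n + 1
  n
}

runs_colon <- count_iterations(idx_colon)  # 2
runs_safe  <- count_iterations(idx_safe)   # 0
```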

nmfkc 0.5.8

Graphviz DOT Output Consolidation and Cleanup

  • Harmonized all DOT-generating functions (nmf.sem.DOT, nmfkc.DOT, nmfkc.ar.DOT) for consistent structure, naming conventions, and visualization logic.

  • Standardized node and edge formatting rules, including unified cluster behavior, color schemes, and edge-scaling conventions.

  • Implemented threshold-aware coefficient labeling so that displayed numerical precision aligns with the visualization threshold, preventing misleadingly detailed labels.

  • Removed unused or redundant DOT fragments and improved compatibility across Graphviz engines.

  • Enhanced layout readability through consistent indentation, node grouping, and suppression of isolated nodes in specific visualization modes (e.g., type = "YA" in nmfkc.DOT).

  • Refactored and expanded internal DOT helper functions (.nmfkc_dot_format_coef, .nmfkc_dot_digits_from_threshold, .nmfkc_dot_cluster_nodes, etc.) for better maintainability and uniform behavior.

  • New Function: Implemented nmfkc.ecv() for Element-wise Cross-Validation (Wold’s CV).

    • This function randomly masks elements of the observation matrix to evaluate structural reconstruction error.
    • It provides a statistically robust criterion for rank selection, avoiding the monotonic error decrease often seen in standard column-wise CV.
    • Supports vector input for rank to evaluate multiple ranks simultaneously.
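The idea behind element-wise cross-validation can be sketched in base R as follows. This is an illustrative re-implementation under our own assumptions (plain weighted NMF with multiplicative updates, Euclidean loss, five folds); the helper names wnmf and ecv_rmse and the argument div are hypothetical here, and the actual nmfkc.ecv() may differ in every detail.

```r
set.seed(1)
# Toy non-negative data with rank-2 structure plus small noise.
P <- 8; N <- 30
X0 <- matrix(runif(P * 2), P, 2)
B0 <- matrix(runif(2 * N), 2, N)
Y  <- X0 %*% B0 + matrix(runif(P * N, 0, 0.01), P, N)

# Weighted NMF by multiplicative updates; W is 0/1, with 0 = held out.
wnmf <- function(Y, W, rank, iter = 300, eps = 1e-10) {
  X <- matrix(runif(nrow(Y) * rank), nrow(Y), rank)
  B <- matrix(runif(rank * ncol(Y)), rank, ncol(Y))
  WY <- W * Y
  for (i in seq_len(iter)) {
    X <- X * ((WY %*% t(B)) / ((W * (X %*% B)) %*% t(B) + eps))
    B <- B * ((t(X) %*% WY) / (t(X) %*% (W * (X %*% B)) + eps))
  }
  X %*% B
}

# Element-wise CV: assign every cell to one of `div` folds at random,
# fit on the remaining cells, and score only the held-out cells.
ecv_rmse <- function(Y, rank, div = 5) {
  fold <- matrix(sample(rep_len(seq_len(div), length(Y))), nrow(Y), ncol(Y))
  sq_err <- 0
  for (k in seq_len(div)) {
    held   <- fold == k
    fit    <- wnmf(Y, W = 1 * (!held), rank = rank)
    sq_err <- sq_err + sum((Y[held] - fit[held])^2)
  }
  sqrt(sq_err / length(Y))   # every cell is held out exactly once
}

rmse_by_rank <- sapply(1:3, function(q) ecv_rmse(Y, q))
```

Because the held-out cells never enter the fit, the error is not guaranteed to keep falling as the rank grows, which is what makes it usable as a rank-selection criterion.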
  • Missing Value & Weight Support:

    • nmfkc() and nmfkc.cv() now fully support missing values (NA) and observation weights via the hidden argument Y.weights (passed through ...).
    • If Y contains NAs, they are automatically detected and masked (assigned a weight of 0) during optimization.
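As a rough illustration of the masking described above (not the package internals), a 0/1 weight matrix can be derived from the NA pattern so that missing cells contribute nothing to a weighted squared-error loss. The placeholder fit below is made up for the example.

```r
Y <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 2)

W <- 1 * (!is.na(Y))          # 1 = observed, 0 = missing
Y0 <- Y
Y0[is.na(Y0)] <- 0            # any placeholder works: its weight is 0

fit  <- matrix(1, nrow(Y), ncol(Y))   # stand-in for a fitted matrix
loss <- sum(W * (Y0 - fit)^2)         # 0 + 1 + 9 + 16 + 25 = 51
```

According to the entry above, a weight matrix of this kind is what the package accepts as Y.weights through the ellipsis.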
  • Rank Selection Diagnostics (nmfkc.rank):

    • Dual-Axis Visualization: The plot now displays fitting metrics (\(R^2\), etc.) on the left axis and ECV Sigma (RMSE) on the right axis (blue line).
    • Automatic Best Rank labeling: The plot explicitly marks the “Best” rank based on two criteria:
      • Elbow: Geometric elbow point of the \(R^2\) curve.
      • Min: Minimum error point of the Element-wise CV.
    • save.time defaults to FALSE, enabling the robust Element-wise CV calculation by default.
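One common way to locate a geometric elbow is to min-max normalise both axes and pick the point farthest from the chord joining the first and last points. The sketch below follows that approach and guards the normalisation against a zero range; whether nmfkc.rank() uses exactly this construction is an assumption, and the function elbow_rank and its fallback to the first rank are hypothetical.

```r
elbow_rank <- function(ranks, r2) {
  rng_x <- diff(range(ranks))
  rng_y <- diff(range(r2))
  # Guard: identical values would give 0/0 in the normalisation below.
  if (rng_x == 0 || rng_y == 0) return(ranks[1])
  x <- (ranks - min(ranks)) / rng_x
  y <- (r2 - min(r2)) / rng_y
  n <- length(x)
  # Perpendicular distance of each point from the chord between the ends.
  d <- abs((y[n] - y[1]) * (x - x[1]) - (x[n] - x[1]) * (y - y[1])) /
    sqrt((y[n] - y[1])^2 + (x[n] - x[1])^2)
  ranks[which.max(d)]
}

best_curved <- elbow_rank(1:5, c(0.50, 0.80, 0.90, 0.93, 0.95))  # 2
best_flat   <- elbow_rank(1:3, c(0.90, 0.90, 0.90))              # 1 (fallback)
```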
  • Argument Standardization:

    • Unified the name of the rank argument to rank across all functions (nmfkc, nmfkc.cv, nmfkc.ecv, nmfkc.rank).
    • The legacy argument Q is still accepted for backward compatibility and is mapped internally to rank.
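A typical pattern for keeping a legacy argument alive is to catch it in the ellipsis and copy it onto the new name. The function fit_stub below is hypothetical and only demonstrates the pattern; it is not how nmfkc() is necessarily written.

```r
# Hypothetical stub: returns the rank it would use.
fit_stub <- function(Y, rank = 2, ...) {
  extra <- list(...)
  # Exact-name lookup with [[ ]] avoids partial matching on list names.
  if (!is.null(extra[["Q"]])) rank <- extra[["Q"]]
  rank
}

new_style <- fit_stub(NULL, rank = 3)  # 3
old_style <- fit_stub(NULL, Q = 4)     # 4: legacy name mapped to rank
default   <- fit_stub(NULL)            # 2
```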
  • Summary Improvements:

    • Updated summary() and print() methods to report:
      • Sparsity of Basis (\(X\)) and Coefficients (\(B\)).
      • Clustering Entropy (indicating “Crisp” vs “Ambiguous” clustering).
      • Clustering Crispness (Mean Max Probability).
      • Number and percentage of missing values in \(Y\).
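The diagnostics listed above can be approximated from a coefficient matrix as shown below. The definitions are assumptions made for illustration: sparsity as the share of exact zeros, crispness as the mean of the column-wise maximum probability (as named in the entry), and entropy as the mean Shannon entropy per column scaled by the log of the number of bases. The package may use thresholds or a different normalisation.

```r
# Toy coefficient matrix: 2 bases (rows) x 3 samples (columns).
B <- matrix(c(3, 1,  0, 4,  2, 2), nrow = 2)

# Column-wise probabilities (each column sums to 1).
B.prob <- sweep(B, 2, colSums(B), "/")

sparsity  <- mean(B == 0)                      # share of exact zeros: 1/6
crispness <- mean(apply(B.prob, 2, max))       # mean max probability: 0.75

# Shannon entropy per column, with 0 * log(0) treated as 0,
# scaled to [0, 1] by log(number of bases).
plogp   <- ifelse(B.prob > 0, B.prob * log(B.prob), 0)
entropy <- mean(-colSums(plogp) / log(nrow(B)))
```

Under these definitions, entropy near 0 corresponds to crisp clustering and entropy near 1 to ambiguous clustering.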
  • Other Improvements:

    • Added a validation check in nmfkc.ar() to ensure the input Y has no missing values (as they cannot be propagated to the covariate matrix A in VAR models).
    • Refined nmfkc.residual.plot() layout margins for better visibility of titles.
    • Updated documentation to reflect all changes.
  • Regularization Update:
    The regularization scheme has been revised from L2 (ridge) to L1 (lasso-type) penalties.

    • gamma now controls the L1 penalty on the coefficient matrix \(B = CA\), promoting sparsity in sample-wise coefficients.
    • A new argument lambda has been added to control the L1 penalty on the parameter matrix \(C\), encouraging sparsity in the shared template structure.
    • Both parameters can be passed through the ellipsis (...) to nmfkc() and related functions.
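Read together, the two penalties suggest an objective of roughly the following form. This is a sketch assuming the Euclidean loss; the exact scaling of each term in the package may differ.

\[
\min_{X \ge 0,\; C \ge 0}\; \lVert Y - XCA \rVert_F^2 \;+\; \gamma\, \lVert CA \rVert_1 \;+\; \lambda\, \lVert C \rVert_1
\]

Because all entries are non-negative, each \(\lVert \cdot \rVert_1\) term is simply the sum of the entries of the matrix.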
  • Function Signature Simplification: Many less-frequently used arguments in nmfkc() (e.g., gamma, X.restriction, X.init) and in nmfkc.cv() (e.g., div, seed) have been moved into the ellipsis (...) for a cleaner function signature.

  • Performance Improvement: The internal function .silhouette.simple was vectorized and optimized to reduce computational cost, particularly for the calculation of a(i) and b(i).

  • Removed the fast.calc option from the nmfkc() function.

  • Added the X.init argument to the nmfkc() function, allowing selection between 'kmeans' and 'nndsvd' initialization methods.

  • The penalty term has been changed from \(\mathrm{tr}(CC^\top)\) to \(\mathrm{tr}(BB^\top) = \mathrm{tr}(CAA^\top C^\top)\).
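The identity behind the new penalty follows from B = CA and can be checked numerically in base R. The matrices below are random placeholders, and tr is a small helper defined for the example.

```r
set.seed(42)
C <- matrix(runif(6), 2, 3)    # parameter matrix (2 x 3)
A <- matrix(runif(12), 3, 4)   # covariate matrix (3 x 4)
B <- C %*% A                   # coefficient matrix (2 x 4)

tr <- function(M) sum(diag(M))

pen_B   <- tr(B %*% t(B))                    # tr(BB')
pen_CA  <- tr(C %*% A %*% t(A) %*% t(C))     # tr(CAA'C'), same value
pen_old <- tr(C %*% t(C))                    # earlier penalty, ignores A
```

Both pen_B and pen_CA equal the sum of squared entries of B, so the revised penalty acts on the sample-wise coefficients rather than on C alone.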

  • Implemented the internal .z and xnorm functions.

  • Added the fast.calc option to the nmfkc() function.

  • Optimized internal calculations for improved performance.

  • Updated citation("nmfkc") and added AIC/BIC to the output.

  • Implemented the nmfkc.ar.stationarity() function.
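For orientation, the textbook stationarity check for a vector autoregression computes the spectral radius of the companion matrix and requires it to be below 1. The sketch below implements that generic check; the function var_spectral_radius is hypothetical, and how nmfkc.ar.stationarity() derives the lag matrices from a fitted model is not covered here.

```r
# coefs: list of p square (k x k) lag matrices A_1, ..., A_p.
var_spectral_radius <- function(coefs) {
  k <- nrow(coefs[[1]])
  p <- length(coefs)
  top <- do.call(cbind, coefs)               # k x (k * p)
  if (p > 1) {
    # Identity block shifts the lags down by one position.
    lower <- cbind(diag(k * (p - 1)), matrix(0, k * (p - 1), k))
    comp  <- rbind(top, lower)
  } else {
    comp <- top
  }
  max(Mod(eigen(comp, only.values = TRUE)$values))
}

r_stable    <- var_spectral_radius(list(0.5 * diag(2), 0.2 * diag(2)))
r_explosive <- var_spectral_radius(list(1.1 * diag(2)))
is_stationary <- c(stable = r_stable < 1, explosive = r_explosive < 1)
```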

  • Modified the z() function.

  • Used crossprod() for faster matrix multiplication.
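For reference, crossprod(X, Y) computes t(X) %*% Y without forming the transpose explicitly, and tcrossprod(X) computes X %*% t(X). A quick equivalence check in base R with random placeholder matrices:

```r
set.seed(1)
X <- matrix(rnorm(20), 5, 4)
Y <- matrix(rnorm(15), 5, 3)

same_cross  <- isTRUE(all.equal(crossprod(X, Y), t(X) %*% Y))
same_gram   <- isTRUE(all.equal(crossprod(X), t(X) %*% X))
same_tcross <- isTRUE(all.equal(tcrossprod(X), X %*% t(X)))
```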

  • Implemented the nmfkc.ar.DOT() function.

  • Added logic to reorder the columns of X so that X forms an identity (unit) matrix in special cases.

  • Implemented nmfkc.kernel.beta.cv() and nmfkc.ar.degree.cv() functions.

  • Set the default column names of X to Basis1, Basis2, etc.

  • Added X.prob and X.cluster to the return object.

  • Skipped CPCC and silhouette calculations when save.time = TRUE.

  • Added a prototype for the nmfkc.ar() function.

  • Added the criterion argument to the nmfkc() function to support multiple criteria.

  • Updated the nmfkc.rank() function.

  • Added the criterion argument to the nmfkc.rank() function.

  • Implemented the save.time argument.

  • Implemented the nmfkc.rank() function.

  • Implemented the nstart option from the kmeans() function.

  • Added an experimental implementation of the nmfkc.rank() function.

  • Removed zero-variance columns and rows with a warning.

  • Added source and references to the documentation.

  • Renamed several components for clarity:

    • nmfkcreg to nmfkc
    • create.kernel to nmfkc.kernel
    • nmfkcreg.cv to nmfkc.cv
    • P to B.prob
    • cluster to B.cluster
    • unit to X.column
    • trace to print.trace
    • dims to print.dims
  • Added the r.squared argument to the nmfkcreg.cv() function.

  • In nmfkcreg():

    • Added the dims argument to check matrix sizes.
    • Added the unit argument to normalize the basis matrix columns.
  • Modified the create.kernel() function to support prediction.

  • Updated examples on GitHub.

  • Removed the YHAT return value; use XB instead.

  • Added the cluster return value for hard clustering.