Single entry point for symmetric NMF of network data with correct multiplicative updates. Three model types are supported via the type argument:

  • tri (type="tri", default): \(Y \approx X C X^\top\) with \(X, C \ge 0\) (both non-negative; \(C\) symmetric by design). Uses the full bilateral gradient of the Frobenius loss.

  • bi (type="bi"): \(Y \approx X X^\top\) (\(C\) fixed to \(I_Q\)). Uses cube-root damping (He et al. 2011).

  • signed (type="signed"): \(Y \approx X (C_{+} - C_{-}) X^\top\) with \(X \ge 0\) and signed \(C = C_{+} - C_{-}\). Preserves the soft-clustering interpretation of \(X\) while allowing negative off-diagonals of \(C\) (inter-cluster repulsion).
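
The three model forms above can be written out directly in base R. This is only a sketch of the reconstruction \(\hat Y\) for each type, with made-up toy values for \(X\), \(C\), \(C_{+}\) and \(C_{-}\); it does not call the package and says nothing about how the factors are estimated.

```r
# Toy sizes: N nodes, Q soft clusters (illustrative values only).
N <- 6; Q <- 2
set.seed(1)

# Non-negative N x Q membership matrix X.
X <- matrix(runif(N * Q), nrow = N, ncol = Q)

# tri: non-negative, symmetric Q x Q middle matrix C.
C_tri <- matrix(c(1.0, 0.2,
                  0.2, 0.8), nrow = Q, byrow = TRUE)
Yhat_tri <- X %*% C_tri %*% t(X)

# bi: C is fixed to the identity, so the model reduces to X X^T.
Yhat_bi <- X %*% diag(Q) %*% t(X)

# signed: C = Cp - Cn with Cp, Cn >= 0, so off-diagonals of C may be negative
# while X itself stays non-negative.
Cp <- diag(c(1.0, 0.8))
Cn <- matrix(c(0.0, 0.3,
               0.3, 0.0), nrow = Q, byrow = TRUE)
C_signed <- Cp - Cn
Yhat_signed <- X %*% C_signed %*% t(X)
```

In the signed case the negative off-diagonal entry of C_signed lowers the fitted affinity between nodes that load on different clusters, which is the inter-cluster repulsion mentioned above.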

Non-negative adjacency matrix assumption. All three types assume \(Y \ge 0\) (a non-negative adjacency/affinity matrix). The qualifier “signed” in type = "signed" refers to the middle coefficient \(C\), not to \(Y\) itself. The underlying Ding, Li & Jordan (2010) sign-splitting updates require \(Y \ge 0\) to guarantee monotone descent; supplying a signed \(Y\) triggers an error. For a signed data matrix, see nmfkc.signed.

Usage

nmfkc.net(
  Y,
  rank = 2,
  type = c("tri", "bi", "signed"),
  epsilon = 1e-04,
  maxit = 5000,
  verbose = FALSE,
  ...
)

Arguments

Y

Symmetric (N x N) non-negative adjacency matrix. NA entries are automatically treated as masked edges (equivalent to supplying Y.weights with 0 at those positions); see the note on Y.weights below.

rank

Integer Q, the number of columns of the basis matrix \(X\) (default 2).

type

"tri" (default), "bi", or "signed".

epsilon, maxit, verbose

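Standard controls for the iterative updates: convergence tolerance (default 1e-04), maximum number of iterations (default 5000), and whether to print progress (default FALSE).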

...

Hidden options: nstart (default 1; see note below), seed (default 123), X.restriction, X.init, C.init (tri only) or Cp.init/Cn.init (signed only), Y.weights, C.L1 (tri only), X.L2.ortho, prefix.

Y.weights is an optional non-negative N x N weight matrix (symmetric, same shape as Y). When supplied, the loss becomes \(\sum W_{ij} \, (Y_{ij} - \hat Y_{ij})^2\) (lm()-style, linear in \(W\)). Logical matrices (TRUE / FALSE) are also accepted. Typical usage by nmfkc.net.ecv is a binary mask (\(W \in \{0,1\}\)) holding out test edges on the upper triangle; real-valued weights for edge-level importance weighting are also supported. If Y.weights is NULL (default) and Y contains NA, a binary mask is auto-generated (0 at NA positions, 1 elsewhere), and the NA entries in Y are replaced by 0 so the multiplicative updates can proceed.
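
The masking and weighting rule described above can be illustrated in base R. This is a minimal sketch, not the package's internal code: weighted_loss is a hypothetical helper name, and Yhat holds made-up fitted values.

```r
# Toy symmetric adjacency matrix with one unobserved (NA) edge.
Y <- matrix(c( 0, 1, NA,
               1, 0,  2,
              NA, 2,  0), nrow = 3, byrow = TRUE)

# Binary mask: 0 at NA positions, 1 elsewhere.
W <- (!is.na(Y)) * 1

# Replace NA by 0 so that arithmetic on Y stays finite.
Y0 <- Y
Y0[is.na(Y0)] <- 0

# Weighted squared-error loss: sum_ij W_ij * (Y_ij - Yhat_ij)^2.
# (hypothetical helper, for illustration only)
weighted_loss <- function(Y, Yhat, W) sum(W * (Y - Yhat)^2)

# Made-up fitted values.
Yhat <- matrix(0.5, nrow = 3, ncol = 3)
loss <- weighted_loss(Y0, Yhat, W)
```

Because the masked positions carry weight 0, the value substituted for NA in Y0 has no effect on the loss.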

X.init controls the initialization of the N x Q basis matrix \(X\). Accepted values:

  • "kmeans" (default): k-means on the rows of \(Y\) (equivalently columns, since \(Y\) is symmetric); the Q cluster centers become the columns of \(X\). Each node is treated as an N-dimensional connectivity profile, so clusters correspond to nodes with similar neighborhood structure – essentially a fast proxy for spectral clustering (Kuang, Yun & Park 2015, SymNMF). Scales well and is the recommended default for network data.

  • "kmeansar": "kmeans" followed by filling zero entries of \(X\) with \(\mathrm{Uniform}(0, \bar Y / 100)\) to escape trivial stationary points.

  • "nndsvd": Non-negative Double SVD with additive randomness (NNDSVDar). Requires a full SVD of \(Y\), so for very large networks (N > a few thousand) "kmeans" is preferable.

  • "runif": Uniform random entries in \([0, 1]\).

  • "random": Legacy default (pre-v0.6.8), equivalent to abs(rnorm(N * Q)) * 0.1. Kept for backward compatibility.

  • A numeric N x Q matrix supplied by the user (used as-is).
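
As a rough illustration of the "kmeans" and "kmeansar" options, the following base-R sketch builds both starting matrices for a toy two-block network. It is written from the description above and is an assumption about the general idea, not the package's actual initialization code; in particular, \(\bar Y\) is taken here to be mean(Y), and the nstart passed to stats::kmeans is that function's own argument, unrelated to the nstart option of nmfkc.net.

```r
set.seed(123)

# Toy symmetric non-negative adjacency matrix: two blocks of 4 nodes each.
N <- 8; Q <- 2
Y <- matrix(0, N, N)
Y[1:4, 1:4] <- 1
Y[5:8, 5:8] <- 1
diag(Y) <- 0

# "kmeans"-style start: cluster the N rows of Y, each row being an
# N-dimensional connectivity profile.
km <- kmeans(Y, centers = Q, nstart = 5)

# km$centers is Q x N; transposing makes the Q centers the columns of X.
X_kmeans <- t(km$centers)

# "kmeansar"-style start: fill exact zeros with small uniform noise
# drawn from Uniform(0, mean(Y) / 100).
X_kmeansar <- X_kmeans
zero <- X_kmeansar == 0
X_kmeansar[zero] <- runif(sum(zero), min = 0, max = mean(Y) / 100)
```

The noise matters because multiplicative updates cannot move an entry that starts at exactly zero.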

When nstart > 1, each restart uses a distinct seed so that k-means / runif / NNDSVDar produce different candidate initial values across the multi-start loop.

Multi-start recommendation. For type = "signed" the \(C = C_{+} - C_{-}\) bottleneck can take both positive and negative values, so the objective has more local minima than for "tri" or "bi". A larger nstart (e.g., 10-50) is recommended during exploration to reduce the chance of being trapped at a suboptimal stationary point. The default of 1 is intended for fast development; raise it for publication-grade runs.
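
The multi-start idea can be sketched in base R with a toy solver for the "bi" model. Everything here is an assumption made for illustration: symnmf_bi is a hypothetical function, the seed scheme (123, 124, ...) is made up, and the update is the cube-root damped multiplicative rule in the form commonly attributed to He et al. (2011), which may differ from what the package does internally.

```r
# Toy symmetric non-negative matrix with two blocks (diagonal included).
N <- 8
Y <- matrix(0, N, N)
Y[1:4, 1:4] <- 1
Y[5:8, 5:8] <- 1

# Hypothetical toy solver for Y ~ X X^T with cube-root damped updates:
#   X <- X * ((Y X) / (X X^T X))^(1/3)
symnmf_bi <- function(Y, Q, seed, maxit = 200) {
  set.seed(seed)
  X <- matrix(runif(nrow(Y) * Q), nrow(Y), Q)   # random positive start
  for (it in seq_len(maxit)) {
    num <- Y %*% X
    den <- X %*% crossprod(X) + 1e-12           # guard against division by zero
    X <- X * (num / den)^(1 / 3)
  }
  list(X = X, loss = sum((Y - tcrossprod(X))^2), seed = seed)
}

# Multi-start loop: one distinct seed per restart, keep the lowest loss.
seeds  <- 123 + 0:9
fits   <- lapply(seeds, function(s) symnmf_bi(Y, Q = 2, seed = s))
losses <- vapply(fits, function(f) f$loss, numeric(1))
best   <- fits[[which.min(losses)]]
```

Comparing the spread of losses across restarts gives a quick indication of how many distinct stationary points the objective has for a given data set.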

Value

Object of class c("nmfkc.net.<type>", "nmfkc.net", "nmfkc"). For type = "signed" the return also carries $Cp, $Cn.
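
A minimal usage sketch follows, using only the arguments and return components documented on this page. It assumes the nmfkc package is installed in a version that exports nmfkc.net; the call is skipped otherwise. The toy matrix Y is made up for illustration.

```r
# Toy symmetric non-negative adjacency matrix: two blocks of 4 nodes each.
N <- 8
Y <- matrix(0, N, N)
Y[1:4, 1:4] <- 1
Y[5:8, 5:8] <- 1
diag(Y) <- 0

# Run only if nmfkc is installed and exports nmfkc.net (assumption).
if (requireNamespace("nmfkc", quietly = TRUE) &&
    "nmfkc.net" %in% getNamespaceExports("nmfkc")) {
  # nstart is passed through ... as one of the hidden options.
  fit <- nmfkc::nmfkc.net(Y, rank = 2, type = "signed", nstart = 10)

  print(class(fit))          # class vector of the returned object
  print(fit$Cp - fit$Cn)     # signed middle matrix C = Cp - Cn
}
```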

Lifecycle

This function is experimental. The interface may change in future versions; details will be described in an upcoming paper.