Overview
CANUS is a clustering algorithm for attributed networks that jointly uses the node attributes X and the rows of the adjacency matrix A. It learns centroids in both spaces and assigns each node to the cluster that minimizes a weighted joint distance. An optional variance-aware filtering mechanism aggregates updates only from samples whose per-sample gradient norms fall within a data-driven band.
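The two ideas in the overview can be illustrated with a small NumPy sketch. This is not the code in canus.py: the weight name rho, the squared Euclidean distance, and the band of mean ± width × standard deviation over the gradient norms are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def joint_assign(X, A, cx, ca, rho=0.5):
    """Assign each node to the cluster minimizing rho*d_X + (1-rho)*d_A.

    Squared Euclidean distance and the weight `rho` are illustrative assumptions.
    """
    dx = ((X[:, None, :] - cx[None, :, :]) ** 2).sum(axis=2)  # (N, K) attribute-space distances
    da = ((A[:, None, :] - ca[None, :, :]) ** 2).sum(axis=2)  # (N, K) adjacency-row distances
    return (rho * dx + (1.0 - rho) * da).argmin(axis=1)

def band_mask(grad_norms, width=1.0):
    """Keep samples whose gradient norm lies within mean +/- width*std (assumed band)."""
    mu, sigma = grad_norms.mean(), grad_norms.std()
    return np.abs(grad_norms - mu) <= width * sigma

def filtered_centroid_step(Z, labels, centroids, lr=0.5, width=1.0):
    """One gradient step per centroid, averaging only the in-band per-sample gradients."""
    new_centroids = centroids.copy()
    for k in range(centroids.shape[0]):
        members = Z[labels == k]
        if len(members) == 0:
            continue  # empty cluster: leave its centroid unchanged
        grads = centroids[k] - members          # gradient of 0.5*||z - c||^2 w.r.t. c, per sample
        keep = band_mask(np.linalg.norm(grads, axis=1), width)
        if keep.any():
            new_centroids[k] = centroids[k] - lr * grads[keep].mean(axis=0)
    return new_centroids
```

The same filtered step can be applied once with Z = X and the attribute centroids, and once with Z = A and the adjacency-row centroids. Samples with unusually large gradient norms, such as a far-away outlier assigned to a cluster, then do not pull that cluster's centroid.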
Key features
- Joint clustering with attributes X and adjacency rows A
- Cosine/Minkowski/Canberra distances
- KMeans++-style joint initialization
- Optional variance-aware filtering of updates
- PyTorch implementation; JAX skeleton for experimentation
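For readers unfamiliar with the listed distances and the seeding scheme, here is a minimal NumPy sketch of both. It is an illustration under stated assumptions, not the repository's implementation: the function names, the weight rho, and the choice to sample seeds with probability proportional to the squared joint distance to the nearest chosen seed (the usual KMeans++ rule applied to a joint distance) are hypothetical.

```python
import numpy as np

def pairwise_distance(P, Q, metric="cosine", p=2.0, eps=1e-12):
    """Distances between rows of P (N, D) and rows of Q (K, D); returns shape (N, K)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if metric == "cosine":
        Pn = P / (np.linalg.norm(P, axis=1, keepdims=True) + eps)
        Qn = Q / (np.linalg.norm(Q, axis=1, keepdims=True) + eps)
        return 1.0 - Pn @ Qn.T
    diff = np.abs(P[:, None, :] - Q[None, :, :])
    if metric == "minkowski":
        return (diff ** p).sum(axis=2) ** (1.0 / p)
    if metric == "canberra":
        denom = np.abs(P)[:, None, :] + np.abs(Q)[None, :, :]
        # Coordinates where both values are zero contribute 0 to the sum.
        ratio = np.divide(diff, denom, out=np.zeros(diff.shape), where=denom > 0)
        return ratio.sum(axis=2)
    raise ValueError(f"unknown metric: {metric}")

def joint_kmeanspp_init(X, A, k, rho=0.5, metric="minkowski", seed=0):
    """Pick k seed nodes; each seed supplies a centroid in both X-space and A-space."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    chosen = [int(rng.integers(n))]           # first seed: uniform at random
    closest = np.full(n, np.inf)              # joint distance to the nearest chosen seed
    for _ in range(1, k):
        last = chosen[-1]
        d = (rho * pairwise_distance(X, X[last:last + 1], metric)[:, 0]
             + (1.0 - rho) * pairwise_distance(A, A[last:last + 1], metric)[:, 0])
        closest = np.minimum(closest, d)
        weights = closest ** 2
        weights[chosen] = 0.0                 # never pick the same node twice
        if weights.sum() == 0.0:              # degenerate case: remaining nodes coincide with seeds
            weights = np.ones(n)
            weights[chosen] = 0.0
        chosen.append(int(rng.choice(n, p=weights / weights.sum())))
    return X[chosen].copy(), A[chosen].copy(), chosen
```

Seeding with actual nodes keeps the attribute centroid and the adjacency-row centroid of each cluster consistent with each other from the first iteration.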
Quickstart (PyTorch)
python canus.py
Programmatic usage:
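import numpy as np
from canus import CANUSClusterer

# Toy problem: N nodes, D attributes per node, K clusters.
N, D, K = 100, 8, 3
X = np.random.randn(N, D).astype(np.float32)  # node attributes, shape (N, D)
A = np.random.rand(N, N).astype(np.float32)   # dense random edge weights, shape (N, N)
A = (A + A.T) / 2.0                           # symmetrize (undirected network)

model = CANUSClusterer(n_clusters=K, epochs=50)
model.fit(X, A)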
print(model.y_pred)
JAX version (experimental)
python canus_jax.py # requires jax
Datasets
Example parquet/CSV datasets are under datasets/. See CANUS_quickstart.ipynb for a minimal walkthrough.
- New: movies_network (MSMN) — medium-scale movie network with TF-IDF content features and main-genre labels. Folder: Datasets/movies_network (MSMN)
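When a dataset ships with ground-truth labels, such as the main-genre labels mentioned above, a predicted clustering can be scored against them. The sketch below computes the adjusted Rand index with NumPy only; the function name is hypothetical and the repository may evaluate results differently.

```python
import numpy as np

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same nodes."""
    _, t = np.unique(np.asarray(labels_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(labels_pred).ravel(), return_inverse=True)
    # Contingency table: cont[i, j] = number of nodes with true class i and predicted cluster j.
    cont = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(cont, (t, p), 1)

    def pairs(x):
        return x * (x - 1) / 2.0              # number of unordered pairs among x items

    sum_cells = pairs(cont).sum()
    sum_rows = pairs(cont.sum(axis=1)).sum()
    sum_cols = pairs(cont.sum(axis=0)).sum()
    expected = sum_rows * sum_cols / pairs(len(t))
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:                 # e.g. both labelings put every node in one group
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

The score is invariant to how cluster ids are numbered, so it can be applied directly to an array of predicted cluster assignments and an array of class labels of the same length.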
Citations
- Shalileh S. A filtered gradient descent clustering method to recover communities in attributed networks. IEEE Access. 2025 Sep 26.
- Paper link: https://ieeexplore.ieee.org/stamp/redirect.jsp?arnumber=/6287639/6514899/11180948.pdf
— Author: Soroosh Shalileh (sr.shalileh@gmail.com)