Poster in Workshop: Algorithmic Fairness Across Alignment Procedures and Agentic Systems

SOMnibus: Recovering Underlying Sensitive Attributes with Self-Organizing Maps

Joseph Bingham ⋅ Netanel Arussy ⋅ Dvir Aran


Abstract

Unsupervised representation learning is often assumed to be benign with respect to sensitive attributes when those attributes are withheld from training. We challenge this assumption by demonstrating systematic \emph{representation-level leakage} of ordinal sensitive attributes in purely unsupervised embeddings. Using \textbf{SOMnibus}, a topology-preserving method based on high-capacity Self-Organizing Maps, we show that attributes such as age and income emerge as dominant latent axes despite being explicitly excluded from the input. Across two large-scale real-world benchmarks, the World Values Survey and the Census-Income (KDD) dataset, SOMnibus recovers monotonic orderings aligned with withheld sensitive attributes, achieving Spearman correlations of up to $0.85$, while PCA and UMAP typically remain below $0.23$. Moreover, unsupervised segmentation of SOMnibus embeddings yields demographically skewed clusters, revealing downstream fairness risks in the absence of any supervised task. These results demonstrate that \emph{fairness through unawareness} can fail at the representation level.
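The leakage measurement described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' SOMnibus implementation: it is a toy one-dimensional SOM written with the Python standard library only, trained on synthetic data, and every name in it (`make_synthetic`, `train_som_1d`, `spearman`, the node count, the learning-rate and neighbourhood schedules) is an assumption made for illustration. The ordinal attribute is used to generate the features and to score the result, but it is never passed to the SOM.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def make_synthetic(n=300, seed=0):
    """Hypothetical data: 3 features driven by a hidden ordinal level 0..4."""
    rng = random.Random(seed)
    loadings = (1.0, 0.8, 0.6)  # assumed influence of the hidden attribute
    features, withheld = [], []
    for _ in range(n):
        level = rng.randrange(5)
        features.append([level * w + rng.gauss(0, 0.2) for w in loadings])
        withheld.append(level)
    return features, withheld


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som_1d(data, n_nodes=10, epochs=30, seed=0):
    """Train a 1-D chain SOM with a Gaussian neighbourhood (online updates)."""
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(row[j] for row in data) for j in range(dim)]
    hi = [max(row[j] for row in data) for j in range(dim)]
    # Linear initialisation along the bounding-box diagonal (a simplification).
    weights = [[lo[j] + (hi[j] - lo[j]) * i / (n_nodes - 1) for j in range(dim)]
               for i in range(n_nodes)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            frac = step / total
            lr = 0.5 * (1 - frac) + 0.02      # decaying learning rate
            sigma = 3.0 * (1 - frac) + 0.5    # shrinking neighbourhood width
            x = data[idx]
            b = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                h = math.exp(-((i - b) ** 2) / (2 * sigma ** 2))
                for j in range(dim):
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights


features, withheld = make_synthetic()
som = train_som_1d(features)                      # the attribute is never seen
coords = [best_matching_unit(som, x) for x in features]
rho = spearman(coords, withheld)
print(f"|Spearman rho| between SOM coordinate and withheld attribute: {abs(rho):.2f}")
```

The absolute value is reported because the orientation of a SOM axis is arbitrary. The synthetic features here depend linearly on the hidden attribute, so this sketch only shows how a leakage score of this kind can be computed; it says nothing about how SOMs compare with PCA or UMAP on real data.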
