Contrastive Learning of Extragalactic Stellar Streams: Sculpting a Latent Space of Representations with DES DR2 Photometry

Ernesto Benitez-Walz, Jelle Mes, Juan Miró-Carretero, Koen Kuijken, Amina Helmi

arXiv:2601.23013 · astro-ph.GA · Published 2026-01-30

We present a self-supervised approach for characterizing low-surface-brightness tidal features in wide-field imaging data by applying the nearest-neighbor contrastive learning of visual representations (NNCLR) algorithm to a curated subset of the Dark Energy Survey Data Release 2 (DES DR2). We construct 38,334 cutouts of well-resolved galaxies in the g, r, and i bands, applying a novel "tiered sigmoid scaling function" that dynamically adjusts image contrast according to each object's signal-to-noise ratio and background level. A supplemental labeled sample of 366 galaxies enables qualitative assessment of the learned embeddings. We train a convolutional neural network with image augmentations, including the injection of simulated background stars, and project the resulting 512-dimensional representations into two dimensions using uniform manifold approximation and projection (UMAP) and its local-density-preserving variant (densMAP). We find that the NNCLR latent space recovers global trends corresponding to major-merger features, yet does not reliably separate stellar streams without further supervision. To interpret the network's implicit attention, we compute gradient-based saliency maps averaged over the full dataset. These reveal that the tiered sigmoid scaling effectively attenuates information from the center of each cutout, thereby suppressing the learning of high-surface-brightness features of the central galaxy. Our study provides a blueprint for leveraging contrastive methods to mine forthcoming survey data for faint tidal substructure, and highlights key preprocessing and interpretability considerations for robust stream detection.
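The nearest-neighbor contrastive objective named in the abstract can be illustrated with a minimal NumPy sketch. It follows the general NNCLR idea (swap each embedding for its nearest neighbor in a support set, then contrast that neighbor against the second augmented view) and is not the authors' implementation. The function names `l2_normalize` and `nnclr_loss`, the temperature of 0.1, the batch size, the support-set size, and the random stand-in embeddings are all assumptions made for illustration; only the 512-dimensional embedding size comes from the abstract.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def nnclr_loss(z1, z2, support, temperature=0.1):
    """One-directional NNCLR-style loss (hypothetical helper, illustrative only).

    z1, z2  : (batch, dim) embeddings of two augmented views of the same images.
    support : (queue, dim) embeddings of previously seen samples.
    """
    z1, z2, support = l2_normalize(z1), l2_normalize(z2), l2_normalize(support)

    # Replace each first-view embedding by its nearest neighbor in the support set.
    nn_idx = np.argmax(z1 @ support.T, axis=1)
    neighbors = support[nn_idx]

    # Similarity of neighbor i to second-view embedding j, scaled by temperature.
    logits = (neighbors @ z2.T) / temperature

    # Numerically stable log-softmax over each row; positives lie on the diagonal.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Random stand-ins for network outputs (assumed sizes: batch of 8, queue of 64 extra).
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 512))
z2 = z1 + 0.01 * rng.normal(size=(8, 512))          # second view: small perturbation
support = np.vstack([z1, rng.normal(size=(64, 512))])

loss_aligned = nnclr_loss(z1, z2, support)
loss_shuffled = nnclr_loss(z1, np.roll(z2, 1, axis=0), support)  # mismatched pairs
print(loss_aligned, loss_shuffled)
```

In this toy setup the correctly paired views give a much lower loss than the deliberately mismatched ones, which is the behavior a contrastive objective is meant to reward. A full training loop would typically symmetrize the loss over both views and refresh the support set with new embeddings each step.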

