Augmenting X-ray Astronomical Representations with Scientific Knowledge through Contrastive Learning
Abstract
Astronomers have produced large multimodal datasets that include images, spectra, and time series, all of which encode physical information about the observed objects. In addition, the astronomical literature has accumulated a large body of physics-specific knowledge about these objects. We introduce a physics-informed representation alignment framework that aligns X-ray observations of astrophysical objects with text summaries describing the physical properties of those sources. We perform contrastive learning between data representations learned with a Poisson process autodecoder and text summary representations generated with a large language model. We demonstrate the generalization capabilities of the system and evaluate the performance of the post-alignment shared representations on regression tasks. We also present a use case for the physical interpretation of newly observed astrophysical sources.
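As an illustration of the alignment step, the following is a minimal sketch of a symmetric contrastive (InfoNCE-style) objective between paired embeddings. It assumes a CLIP-like formulation, which this abstract does not specify. The function names, the temperature value, and the premise that both modalities have already been projected to a common dimension are assumptions made for illustration, not details of the framework itself.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(xray_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of xray_emb and row i of text_emb are assumed to describe the
    same source; every other row in the batch serves as a negative.
    The temperature of 0.07 is a hypothetical default, not a value
    taken from the paper.
    """
    x = l2_normalize(xray_emb)
    t = l2_normalize(text_emb)
    logits = x @ t.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(len(x))
    # X-ray to text: each row should select its own column.
    loss_x2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text to X-ray: each column should select its own row.
    loss_t2x = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_x2t + loss_t2x)


# Synthetic stand-ins for the two sets of representations.
rng = np.random.default_rng(0)
xray = rng.normal(size=(8, 16))
text_matched = xray + 0.01 * rng.normal(size=(8, 16))
text_shuffled = np.roll(text_matched, shift=1, axis=0)

loss_matched = symmetric_contrastive_loss(xray, text_matched)
loss_shuffled = symmetric_contrastive_loss(xray, text_shuffled)
```

In this toy setup, correctly paired embeddings give a lower loss than the same embeddings with the pairing shifted by one row, which is the behaviour a contrastive alignment objective is meant to reward.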