Poster session A
ICLR 2025 Workshop on GenAI Watermarking (WMARK)
Are Watermarks For Diffusion Models Radioactive?
Jan Dubiński · Michel Meintz · Franziska Boenisch · Adam Dziedzic
As generative artificial intelligence (AI) models become increasingly widespread, ensuring transparency and provenance in AI-generated content has become a critical challenge. Watermarking techniques have been proposed to embed imperceptible yet detectable signals in AI-generated images, enabling provenance tracking and copyright enforcement. However, a second party can repurpose images generated by an existing model to train their own diffusion model, potentially disregarding the ownership rights of the original model creator. Recent research on language models has explored the concept of watermark "radioactivity": embedded signals that persist when a new model is trained or fine-tuned on watermarked outputs, enabling the detection of models trained on watermarked data. In this work, we investigate whether similar persistence occurs in diffusion models. Our findings reveal that none of the tested watermarking methods transfer their signal when their outputs are used to fine-tune a second model: watermark detection on images generated by the new model is indistinguishable from random guessing. These results indicate that existing techniques are insufficient for ensuring watermark propagation through the model derivation chain, and that novel approaches are needed to achieve effective and resilient watermark transfer in diffusion models.
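To make the detection criterion concrete, here is a minimal sketch of how such a radioactivity test could be framed statistically. It is not the authors' method; it assumes a hypothetical bit-based watermark in which a detector extracts a bit string from an image and compares it against the owner's secret key, with agreement at chance level (50%) indicating no watermark. A one-sided exact binomial test then quantifies whether the match rate exceeds random guessing.

```python
import math
import random

def binomial_p_value(matches: int, n: int, p: float = 0.5) -> float:
    """One-sided exact binomial test: P(X >= matches) for X ~ Binomial(n, p).

    A small p-value means the extracted bits agree with the owner's key
    far more often than chance, i.e. the watermark is detected.
    """
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(matches, n + 1))

def match_count(extracted: list[int], key: list[int]) -> int:
    # Number of watermark bits recovered from an image that match the key.
    return sum(e == k for e, k in zip(extracted, key))

if __name__ == "__main__":
    random.seed(0)
    n_bits = 256
    key = [random.randint(0, 1) for _ in range(n_bits)]

    # Hypothetical extraction from a fine-tuned model's output: if the
    # watermark is not radioactive, the extracted bits are independent of
    # the key, so agreement stays near 50% and the p-value stays large.
    extracted = [random.randint(0, 1) for _ in range(n_bits)]
    m = match_count(extracted, key)
    print(f"matches: {m}/{n_bits}, p-value: {binomial_p_value(m, n_bits):.4f}")
```

The abstract's finding corresponds to the second-generation model consistently producing large p-values under any such test, i.e. its images carry no statistically detectable trace of the original watermark.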