Poster in Workshop: ICLR 2025 Workshop on Bidirectional Human-AI Alignment
Envision Human-AI Perceptual Alignment from a Multimodal Interaction Perspective
Shu Zhong · Marianna Obrist
Aligning AI with human intent has seen progress, yet perceptual alignment (how AI interprets what we see, hear, feel, or smell) remains underexplored. This paper advocates for expanding perceptual alignment efforts to a broader range of sensory modalities, such as touch and smell, which are critical to how humans perceive and interpret their environment. We envision AI systems that enable natural, multisensory interactions in everyday contexts, such as selecting clothing that matches both temperature and texture preferences or recreating rich sensory ambiances that evoke specific sights, sounds, and smells. By advancing multimodal representation learning and perceptual alignment, this work aims to inspire the computer science and human-computer interaction (HCI) communities to design inclusive, human-centered AI systems for everyday, multisensory experiences.