Poster

Diving Segmentation Model into Pixels

Chen Gan · Zihao Yin · Kelei He · Yang Gao · Junfeng Zhang

2024 Poster

Abstract

More distinguishable and consistent pixel features for each category benefit semantic segmentation under various settings. Existing efforts to mine better pixel-level features attempt to explicitly model the categorical distribution, which fails to achieve optimal results due to the significant variance of pixel features. Moreover, prior research has scarcely analyzed or carefully handled pixel-level variance, leaving semantic segmentation at a coarse granularity. In this work, we analyze the causes of pixel-level variance and introduce the concept of $\textbf{pixel learning}$, which concentrates on a tailored learning process for pixels to handle this variance and enhance the per-pixel recognition capability of segmentation models. Under the pixel learning scheme, each image is viewed as a distribution of pixels, and pixel learning aims to pursue consistent pixel representation inside an image, continuously align pixels from different images (distributions), and eventually achieve consistent pixel representation for each category, even across domains. We propose a pure pixel-level learning framework, namely PiXL, which consists of a pixel partition module to divide pixels into sub-domains, a prototype generation and selection module to prepare targets for subsequent alignment, and a pixel alignment module to guarantee pixel feature consistency within images, across images, and across domains. Extensive evaluations on multiple learning paradigms, including unsupervised domain adaptation and semi-/fully-supervised segmentation, show that PiXL outperforms state-of-the-art methods, especially when annotated images are scarce. Visualization of the embedding space further demonstrates that pixel learning attains a superior representation of pixel features. The code is available at https://github.com/ChenGan-JS/PiXL.
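
The three stages named in the abstract (partition, prototype generation, alignment) can be illustrated with a toy sketch. This is not the authors' implementation: the entropy-based split, the mean-feature prototypes, the squared-distance loss, and every function name below are assumptions made purely for illustration. The released repository remains the reference for the actual method.

```python
import math

def entropy(probs):
    """Shannon entropy of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def predicted_class(probs):
    """Index of the most probable class for one pixel."""
    return max(range(len(probs)), key=probs.__getitem__)

def partition_pixels(all_probs, threshold):
    """Assumed partition rule: low-entropy pixels are 'confident', the rest 'uncertain'."""
    confident, uncertain = [], []
    for i, p in enumerate(all_probs):
        (confident if entropy(p) <= threshold else uncertain).append(i)
    return confident, uncertain

def class_prototypes(features, all_probs, indices):
    """Assumed prototype rule: mean feature of the given pixels per predicted class."""
    sums, counts = {}, {}
    for i in indices:
        c = predicted_class(all_probs[i])
        acc = sums.setdefault(c, [0.0] * len(features[i]))
        for d, v in enumerate(features[i]):
            acc[d] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def alignment_loss(features, all_probs, indices, prototypes):
    """Assumed alignment objective: mean squared distance to the predicted-class prototype."""
    total, n = 0.0, 0
    for i in indices:
        c = predicted_class(all_probs[i])
        if c not in prototypes:
            continue  # no confident pixel of this class, so no target to align to
        total += sum((a - b) ** 2 for a, b in zip(features[i], prototypes[c]))
        n += 1
    return total / n if n else 0.0

# Toy data: four pixels, 2-D features, two classes (all values hypothetical).
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.4, 0.6]]
all_probs = [[0.95, 0.05], [0.9, 0.1], [0.05, 0.95], [0.55, 0.45]]

confident, uncertain = partition_pixels(all_probs, threshold=0.5)
prototypes = class_prototypes(features, all_probs, confident)
loss = alignment_loss(features, all_probs, uncertain, prototypes)
print(confident, uncertain, prototypes, loss)
```

In this toy setup the confident pixels supply the per-class targets and only the uncertain pixels contribute to the loss, which mirrors the abstract's idea of preparing targets first and then aligning the remaining pixels to them.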
