Affinity Posters
Tiny Papers Poster Session 7
Krystal Maughan · Thomas F Burns
Halle B
Schedule
Fri 1:45 a.m. - 3:45 a.m.
SESSION-AWARE PRODUCT FILTER RANKING IN E-COMMERCE SEARCH (Poster #282)
Poster Location: Halle B #282
Product filters are commonly used by e-commerce websites to refine search results based on attribute values such as price, brand, and size. However, existing filter recommendation approaches typically generate filters independently of the user's search query or browsing history. This can lead to suboptimal recommendations that do not account for what the user has already viewed or selected in their current browsing session. In this paper, we propose a session-aware product filter recommendation framework that leverages the user's past actions to provide filter recommendations. An offline evaluation demonstrates that our model achieves a significant improvement over non-contextual baseline models.
Hanqing Lu · Xianfeng Tang · Chen Luo · Limeng Cui · Zhenwei DAI · Rahul Goutam · Haiyang Zhang · Monica Cheng
Fri 1:45 a.m. - 3:45 a.m.
Improving Image Editing Models with Generative Data Refinement (Poster #281)
Poster Location: Halle B #281
Instruction-based generative image editing models allow an image to be modified based on a text prompt and have the potential to significantly improve the accessibility of image processing software. Like other generative models, they are highly dependent on the quality of their training dataset, and generating good editing datasets is an expensive task. In this paper, we show that a simple refinement of the original InstructPix2Pix (Brooks et al., 2023) dataset using SDXL (Podell et al., 2023) leads to consistent improvements in downstream models. We finetune SDXL on our refined dataset and observe performance competitive with much more cost-intensive methods. We will make the dataset and models publicly available.
Frederic Boesel · Robin Rombach
Fri 1:45 a.m. - 3:45 a.m.
On Difficulties of Attention Factorization through Shared Memory (Poster #246)
Poster Location: Halle B #246
Transformers have revolutionized deep learning in numerous fields, including natural language processing, computer vision, and audio processing. Their strength lies in their attention mechanism, which allows for the discovery of complex input relationships. However, this mechanism's quadratic time and memory complexity poses challenges for larger inputs. Researchers are now investigating models like Linear Unified Nested Attention (Luna) or the Memory Augmented Transformer, which leverage external learnable memory either to reduce the attention computation complexity down to linear, or to propagate information between chunks in chunk-wise processing. Our findings challenge the conventional thinking on these models, revealing that interfacing with the memory directly through an attention operation is suboptimal, and that performance may be considerably improved by filtering the input signal before communicating with memory.
Uladzislau Yorsh · Martin Holeňa · Ondřej Bojar · David Herel
Fri 1:45 a.m. - 3:45 a.m.
|
KFC: Knowledge Reconstruction and Feedback Consolidation Enable Efficient and Effective Continual Generative Learning
(
Poster
#247
)
>
link
Poster Location: Halle B #247 To address the issues of catastrophic forgetting in Continual Generative Learning (CGL), dominant methods leverage the generative replay strategy. However, they often suffer from high time complexity and inferior generative sample quality. In this work, we develop an efficient and effective CGL method via Knowledge reconstruction and Feedback Consolidation (KFC). KFC extends the inherent data reconstruction properties of the variational autoencoder framework to historical knowledge reconstruction and re-encodes the current task's reconstructed data to the same posterior distribution as the original data. Experiments showcase that KFC achieves state-of-the-art performances in time complexity, sample quality, and accuracy on various CGL tasks. Code is available in Supplementary Materials. |
Libo Huang · Zhulin An · Yan Zeng · xiang zhi · Yongjun Xu 🔗 |
Fri 1:45 a.m. - 3:45 a.m.
Zero-shot generalization across architectures for visual classification (Poster #248)
Poster Location: Halle B #248
Generalization to unseen data is a key desideratum for deep networks, but its relation to classification accuracy is unclear. Using a minimalist vision dataset and a measure of generalizability, we show that popular networks, from deep convolutional networks (CNNs) to transformers, vary in their power to extrapolate to unseen classes both across layers and across architectures. Accuracy is not a good predictor of generalizability, and generalization varies non-monotonically with layer depth.
Evan Gerritz · Luciano Dyballa · Steven Zucker
Fri 1:45 a.m. - 3:45 a.m.
Non Parametric Aleatoric Uncertainty Quantification with Neural Networks (Poster #249)
Poster Location: Halle B #249
Classic methods for aleatoric uncertainty quantification in regression settings make assumptions about the distribution of noise in the dependent variable. Incorrect assumptions can lead to poor model performance and unreliable uncertainty estimates. In this paper, we introduce a simple method for non-parametric aleatoric uncertainty quantification. In particular, we train a neural network model for binary classification. The inputs to our binary classifier are the independent variables and a sample from the marginal distribution of the dependent variable. The classifier is trained to predict whether the sampled value is greater than the dependent variable corresponding to the independent variables in the input. Our method can be used not only for quantifying aleatoric uncertainty but also for estimating the conditional distribution of the dependent variable.
Kshitij Kapoor · Debayan Gupta
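The classification trick described in this abstract can be sketched in a few lines. The following is an illustrative numpy-only reconstruction, not the authors' code: a simple logistic model stands in for the neural network, and its output approximates the conditional CDF P(Y <= t | X = x), since the classifier learns the probability that a marginal draw t exceeds the true response.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y depends on x with Gaussian noise.
n = 4000
x = rng.uniform(-2, 2, size=n)
y = x + 0.5 * rng.normal(size=n)

# Pair each x with a draw t from the *marginal* distribution of y,
# and label whether t exceeds the true y for that x.
t = rng.permutation(y)
labels = (t > y).astype(float)

# Tiny logistic model on features (x, t, 1), trained by gradient descent;
# the paper uses a neural network, which would replace this linear model.
X = np.column_stack([x, t, np.ones(n)])
w = np.zeros(3)
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 1.0 * X.T @ (p - labels) / n

def cdf_estimate(x_query, t_query):
    """Estimated conditional CDF P(Y <= t | X = x)."""
    z = w[0] * x_query + w[1] * t_query + w[2]
    return 1.0 / (1.0 + np.exp(-z))

# The estimate should increase in t and track the noise scale around x = 0.
lo, hi = cdf_estimate(0.0, -1.5), cdf_estimate(0.0, 1.5)
```

Sweeping `t_query` then traces out an estimate of the full conditional distribution, which is how the method yields uncertainty estimates without a parametric noise assumption.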
Fri 1:45 a.m. - 3:45 a.m.
Bypassing the Safety Training of Open-Source LLMs with Priming Attacks (Poster #250)
Poster Location: Halle B #250
With the recent surge in popularity of LLMs has come an ever-increasing need for LLM safety training. In this paper, we investigate the fragility of SOTA open-source LLMs under simple, optimization-free attacks we refer to as *priming attacks*, which are easy to execute and effectively bypass alignment from safety training. Our proposed attack improves the Attack Success Rate on Harmful Behaviors, as measured by Llama Guard, by up to $3.3\times$ compared to baselines.
Jason Vega · Isha Chaudhary · Changming Xu · Gagandeep Singh
Fri 1:45 a.m. - 3:45 a.m.
Reward Bound for Behavioral Guarantee of Model-based Planning Agents (Poster #251)
Poster Location: Halle B #251
Recent years have seen an emerging interest in the Verification and Validation (V&V) of machine learning-based agents in the wild, especially in robotics, to provide safety assurance for the industry. Obtaining behavioral guarantees for these agents remains an important problem. In this work, we focus on guaranteeing that a model-based planning agent reaches a goal state within a specific future time step. We show that there exists a lower bound on the reward at the goal state such that, if the reward falls below that bound, it is impossible to obtain such a guarantee. By extension, we show how to enforce preferences over multiple goals.
Zhiyu An · Xianzhong Ding · Wan Du
Fri 1:45 a.m. - 3:45 a.m.
Partial Rankings of Optimizers (Poster #252)
Poster Location: Halle B #252
We introduce a framework for benchmarking optimizers according to multiple criteria over various test functions. Based on a recently introduced union-free generic depth function for partial orders/rankings, it fully exploits the ordinal information and allows for incomparability. Our method describes the distribution of all partial orders/rankings, avoiding the notorious shortcomings of aggregation. This makes it possible to identify test functions that produce central or outlying rankings of optimizers and to assess the quality of benchmarking suites.
Julian Rodemann · Hannah Blocher
Fri 1:45 a.m. - 3:45 a.m.
Colorful Cutout: Enhancing Image Data Augmentation with Curriculum Learning (Poster #253)
Poster Location: Halle B #253
Data augmentation is one of the regularization strategies used in the training of deep learning models; it enhances generalizability and prevents overfitting, leading to performance improvement. Although researchers have proposed various data augmentation techniques, they often lack consideration of the difficulty of the augmented data. Recently, another line of research has suggested incorporating curriculum learning into data augmentation in the field of natural language processing. In this study, we adopt curriculum data augmentation for image data and propose colorful cutout, which gradually increases the noise and difficulty introduced in the augmented image. Our experimental results highlight the possibility of curriculum data augmentation for image data. We publicly release our source code to improve the reproducibility of our study.
Juhwan Choi · Youngbin Kim
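The curriculum idea in this abstract, cutout whose difficulty grows over training, can be sketched as follows. This is an illustrative reconstruction only: the linear patch-size schedule and random-color fill are assumptions for the sketch, not the authors' released implementation.

```python
import numpy as np

def colorful_cutout(image, epoch, total_epochs, rng):
    """Cutout variant whose difficulty grows with training progress:
    the erased patch gets larger over epochs and is filled with random
    colors rather than a constant value. Illustrative sketch only;
    the schedule and fill strategy here are assumptions."""
    h, w, _ = image.shape
    # Curriculum: patch side grows linearly from 10% to 50% of the image.
    frac = 0.1 + 0.4 * epoch / max(total_epochs - 1, 1)
    ph, pw = max(1, int(h * frac)), max(1, int(w * frac))
    top = rng.integers(0, h - ph + 1)
    left = rng.integers(0, w - pw + 1)
    out = image.copy()
    # "Colorful": fill the cutout region with random per-pixel colors.
    out[top:top + ph, left:left + pw] = rng.integers(
        0, 256, size=(ph, pw, 3), dtype=image.dtype)
    return out

rng = np.random.default_rng(0)
img = np.zeros((32, 32, 3), dtype=np.uint8)
early = colorful_cutout(img, epoch=0, total_epochs=10, rng=rng)  # small patch
late = colorful_cutout(img, epoch=9, total_epochs=10, rng=rng)   # large patch
```

Early epochs thus see mildly corrupted images and later epochs harder ones, which is the curriculum-learning component the title refers to.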
-
Averaging Rate Scheduler for Decentralized Learning on Heterogeneous Data (Poster #254)
Poster Location: #254
Presently, state-of-the-art decentralized learning algorithms typically require the data distribution to be Independent and Identically Distributed (IID). However, in practical scenarios, the data distribution across the agents can have significant heterogeneity. In this work, we propose averaging rate scheduling as a simple yet effective way to reduce the impact of heterogeneity in decentralized learning. Our experiments illustrate the superiority of the proposed method (~3% improvement in test accuracy) compared to the conventional approach of employing a constant averaging rate.
Sai Aparna Aketi
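The mechanism this abstract describes, replacing a constant averaging rate with a scheduled one in decentralized (gossip) averaging, can be sketched in a few lines. This is a hedged illustration: the linear ramp below is one plausible schedule shape chosen for the example, not necessarily the schedule used in the paper.

```python
import numpy as np

def averaging_rate(step, total_steps, start=0.1, end=1.0):
    # Hypothetical linear ramp from `start` to `end`; the actual schedule
    # shape used in the paper may differ.
    return start + (end - start) * step / max(total_steps - 1, 1)

def gossip_average(local_params, neighbor_params, gamma):
    # One decentralized averaging step for a single agent: move the local
    # model a fraction `gamma` toward the mean of its neighbors' models.
    # gamma = 1 recovers plain neighborhood averaging; gamma = 0 keeps
    # the local model unchanged (no mixing).
    neighborhood_mean = np.mean(neighbor_params, axis=0)
    return local_params + gamma * (neighborhood_mean - local_params)

# Example: with a scheduled rate, some steps mix weakly and others fully,
# instead of applying the same constant rate throughout training.
local = np.array([0.0, 0.0])
neighbors = np.array([[2.0, 2.0], [4.0, 4.0]])
early_mix = gossip_average(local, neighbors, averaging_rate(0, 10))
late_mix = gossip_average(local, neighbors, averaging_rate(9, 10))
```

The design question the paper addresses is how strongly agents with heterogeneous data should average with neighbors at each point in training; a schedule makes that strength a function of the training step rather than a fixed hyperparameter.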