Affinity Posters
Tiny Papers Poster Session 6
Krystal Maughan · Thomas F Burns
Halle B
Schedule
Thu 7:30 a.m. - 9:30 a.m. | Policy Optimization in RLHF: The Impact of Out-of-preference Data (Poster #306)
Poster Location: Halle B #306
Aligning agents with human preferences is important. This paper examines two types of alignment methods: Direct Preference Optimization (DPO) and Reward-Model-Based Policy Optimization (RMB-PO). A variant of RMB-PO, referred to as RMB-PO+, is also considered. These methods, either explicitly or implicitly, learn a reward model from preference data and differ in the data used for policy optimization to unlock the generalization ability of the reward model. In particular, compared with DPO, RMB-PO additionally uses policy-generated data, and RMB-PO+ further leverages new, preference-free data (i.e., prompts or so-called states). We examine the impact of such out-of-preference data through synthetic contextual bandit problems. Our study suggests that RMB-PO+ outperforms the other two approaches. In particular, even when providing the policy model with a good feature representation, we find that policy optimization with adequate out-of-preference data significantly improves performance by harnessing the reward model's generalization capabilities. We present an analysis based on stochastic approximation and relate our results to other research, including imitation learning and reinforcement learning.
Ziniu Li · Tian Xu · Yang Yu
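The setup above lends itself to a compact illustration. Below is a minimal numpy sketch (not the authors' code) of the contrast the abstract draws: a linear reward model is fit on synthetic preference pairs with a Bradley-Terry loss, and an RMB-PO+-style step then queries it on new, preference-free prompts; the linear reward form and all data here are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_pref, n_new = 8, 64, 512
w_true = rng.normal(size=d)                       # hidden "true" reward weights
true_reward = lambda x: x @ w_true                # x: state-action features

# Preference data: pairs of candidate responses (as feature vectors).
x0, x1 = rng.normal(size=(n_pref, d)), rng.normal(size=(n_pref, d))
prefs = (true_reward(x1) > true_reward(x0)).astype(float)

# Fit a linear reward model with a Bradley-Terry likelihood (gradient ascent).
w = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(x1 - x0) @ w))      # P(x1 preferred | w)
    w += 0.1 * ((prefs - p)[:, None] * (x1 - x0)).mean(axis=0)

# RMB-PO+-style step: pick actions on *new, preference-free* prompts by
# querying the learned reward model, exploiting its generalization.
candidates = rng.normal(size=(n_new, 4, d))       # 4 candidates per new prompt
chosen = candidates[np.arange(n_new), (candidates @ w).argmax(axis=1)]
print("avg true reward on out-of-preference data:", true_reward(chosen).mean())
```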
Thu 7:30 a.m. - 9:30 a.m. | Dissecting Zero-Shot Visual Reasoning Capabilities in Vision and Language Models (Poster #305)
Poster Location: Halle B #305
Vision-language models (VLMs) have shown impressive zero- and few-shot performance on real-world visual question answering (VQA) benchmarks, alluding to their capabilities as visual reasoning engines. However, existing works typically use benchmarks that conflate “pure” visual reasoning with world knowledge, and also have questions that involve a limited number of reasoning steps. Thus, it remains unclear whether a VLM’s apparent visual reasoning performance is due to its world knowledge or to actual visual reasoning capabilities. To resolve this ambiguity, we systematically benchmark and dissect the zero-shot visual reasoning capabilities of VLMs through synthetic datasets that require minimal world knowledge and allow for analysis over a broad range of reasoning steps. We specifically focus on evaluating the impact of conveying scene information to the underlying large language model (LLM) of the VLM as either visual embeddings or purely textual scene descriptions. We notably find that the underlying LLMs consistently perform significantly better when provided textual scene descriptions than when provided visual embeddings. Our work comprehensively identifies limitations of VLMs for compositional visual reasoning and highlights the important role that LLMs can play in scene understanding and visual reasoning.
Aishik Nagar · Shantanu Jaiswal · Cheston Tan
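As a rough sketch of the evaluation contrast described above (textual scene descriptions versus visual embeddings), the snippet below serializes a synthetic scene into text for the LLM; `ask_llm` is a hypothetical stand-in for an LLM client, and the scene format is invented for illustration.

```python
# Hypothetical `ask_llm` stands in for whatever LLM backend is available;
# the scene format below is invented for illustration (CLEVR-style).
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def scene_to_text(objects) -> str:
    """Serialize a synthetic scene into a purely textual description."""
    return " ".join(
        f"There is a {o['size']} {o['color']} {o['shape']} at {o['pos']}."
        for o in objects
    )

scene = [
    {"size": "large", "color": "red", "shape": "cube", "pos": (1, 2)},
    {"size": "small", "color": "blue", "shape": "sphere", "pos": (3, 0)},
]
question = "Is the red object larger than the blue object?"

# Text condition: the underlying LLM receives only a textual scene
# description; the contrasting condition would feed visual embeddings.
prompt = f"Scene: {scene_to_text(scene)}\nQuestion: {question}\nAnswer:"
print(prompt)
```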
Thu 7:30 a.m. - 9:30 a.m. | Exploiting Time Channel Vulnerability of Learned Bloom Filters (Poster #304)
Poster Location: Halle B #304
Neural networks for computer systems—such as operating systems, databases, and network systems—attract much attention. However, using neural networks in systems introduces new attack surfaces. This paper makes the first attempt to study the security of learned Bloom filters, a promising neural-network-based data structure for systems. We design and implement an attack that can efficiently recover system owners’ data via a timing side channel and a new recovery algorithm.
Harman Farwah · Gagandeep Singh · Cheng Tan
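To make the attack surface concrete, here is a toy sketch (an assumed simplified structure, not the paper's implementation): a learned Bloom filter answers model-positive queries on a fast path and falls back to a backup filter otherwise, so per-query latency leaks which path a key took.

```python
import time

class LearnedBloomFilter:
    """Toy learned Bloom filter: a learned 'model' scores keys, and keys the
    model scores below threshold fall back to a backup filter (here, a set)."""
    def __init__(self, keys, model, threshold=0.5):
        self.model, self.threshold = model, threshold
        self.backup = {k for k in keys if model(k) < threshold}

    def contains(self, key):
        if self.model(key) >= self.threshold:
            return True                       # fast path: model answers alone
        return key in self.backup             # slow path: backup lookup

def avg_latency(lbf, key, reps=100_000):
    t0 = time.perf_counter()
    for _ in range(reps):
        lbf.contains(key)
    return (time.perf_counter() - t0) / reps

model = lambda k: (hash(k) % 100) / 100.0     # stand-in for the learned model
lbf = LearnedBloomFilter([f"user{i}" for i in range(1000)], model)
# An attacker times probe keys: slow-path keys reveal that the model scored
# them low, leaking structure about the stored data to drive recovery.
for probe in ["user0", "user1", "nobody42"]:
    print(probe, f"{avg_latency(lbf, probe) * 1e9:.0f} ns/query")
```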
Thu 7:30 a.m. - 9:30 a.m. | Analog In-Memory Computing with Uncertainty Quantification for Efficient Edge-based Medical Imaging Segmentation (Poster #303)
Poster Location: Halle B #303
This work investigates the role of the emerging analog in-memory computing (AIMC) paradigm in enabling medical AI analysis and improving the certainty of these models at the edge. It contrasts AIMC's efficiency with traditional digital computing's limitations in power, speed, and scalability. Our comprehensive evaluation covers brain tumor analysis, spleen segmentation, and nuclei detection. The study highlights the superior robustness of isotropic architectures, which exhibit a minimal accuracy drop (0.04) under analog-aware training, compared with significant drops (up to 0.15) in pyramidal structures. Additionally, the paper emphasizes AIMC's effective data pipelining, which reduces latency and increases throughput, as well as the inherent noise within AIMC, strategically harnessed to augment model certainty.
Imane Hamzaoui · Hadjer Benmeziane · Zayneb Cherif · Kaoutar El Maghraoui
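The "noise as certainty" idea admits a short sketch. The snippet below (an assumed noise model, not the paper's hardware) runs repeated forward passes with multiplicative weight noise and reads the sample spread as an uncertainty estimate, in the spirit of MC-dropout.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))                 # trained layer weights

def analog_forward(x, noise_std=0.05):
    """Forward pass with multiplicative weight noise, mimicking conductance
    variation in analog in-memory hardware (an assumed noise model)."""
    return x @ (W * (1 + rng.normal(scale=noise_std, size=W.shape)))

x = rng.normal(size=16)
samples = np.stack([analog_forward(x) for _ in range(32)])
# The spread across noisy passes acts as a per-output uncertainty estimate
# "for free": hardware noise plays the role of MC-dropout perturbations.
print("prediction:", samples.mean(axis=0))
print("uncertainty:", samples.std(axis=0))
```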
Thu 7:30 a.m. - 9:30 a.m. | Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping (Poster #302)
Poster Location: Halle B #302
Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. It serves as a proof of concept, and we hope it will inspire future developments toward computationally efficient IRL.
Lauren H. Cooke · Harvey Klyne · David Bell · Cassidy Laidlaw · Milind Tambe · Finale Doshi-Velez
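Potential-based shaping has a standard closed form, r' = r + γ·Φ(s') − Φ(s), which provably preserves optimal policies (Ng, Harada & Russell, 1999). A minimal sketch, with a hypothetical distance-to-goal potential chosen purely for illustration:

```python
def shaped_reward(r, s, s_next, phi, gamma=0.99, done=False):
    """Potential-based shaping: r' = r + gamma * phi(s') - phi(s).
    This form preserves optimal policies (Ng, Harada & Russell, 1999),
    while densifying the reward each inner RL sub-problem sees."""
    return r + (0.0 if done else gamma * phi(s_next)) - phi(s)

# Hypothetical potential for illustration: negative distance to a goal.
goal = 10.0
phi = lambda s: -abs(goal - s)
print(shaped_reward(0.0, s=3.0, s_next=4.0, phi=phi))  # > 0: moved closer
```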
Thu 7:30 a.m. - 9:30 a.m. | Software 1.0 Strengths for Interpretability and Data Efficiency (Poster #301)
Poster Location: Halle B #301
Machine learning has demonstrated remarkable capabilities across various tasks, yet it confronts significant challenges such as limited interpretability, reliance on extensive data, and difficulties in incorporating human intuition. In contrast, traditional software development avoids these pitfalls, offering full interpretability, less data dependency, and easy integration of intuitive decision-making. To combine the strengths of both approaches, we introduce the BasedOn library. This tool keeps the focus on code written by programmers while providing very simple interfaces for using machine learning. The BasedOn library, leveraging policy gradient methods, offers "learnable" if statements.
Maral Jabbarishiviari · Arshia Soltani Moakhar
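A "learnable" if statement can be sketched as a Bernoulli gate trained with REINFORCE. The interface below is a guess at the flavor of the idea, not the BasedOn library's actual API:

```python
import math, random

class LearnableIf:
    """A 'learnable' if statement: the branch choice is a Bernoulli draw on
    sigmoid(theta), trained with REINFORCE from a downstream reward.
    Interface is hypothetical, not the BasedOn library's actual API."""
    def __init__(self, lr=0.1):
        self.theta, self.lr, self._trace = 0.0, lr, []

    def __call__(self) -> bool:
        p = 1 / (1 + math.exp(-self.theta))
        taken = random.random() < p
        self._trace.append((taken, p))
        return taken

    def reinforce(self, reward: float):
        for taken, p in self._trace:           # grad of log-prob of branch
            self.theta += self.lr * reward * ((1 - p) if taken else -p)
        self._trace.clear()

branch = LearnableIf()
for _ in range(200):                           # learn that True pays off
    reward = 1.0 if branch() else -1.0
    branch.reinforce(reward)
print("P(take branch) ->", 1 / (1 + math.exp(-branch.theta)))
```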
Thu 7:30 a.m. - 9:30 a.m. | Enhancing Drug-Drug Interaction Prediction with Context-Aware Architecture (Poster #300)
Poster Location: Halle B #300
In the field of disease treatment, the simultaneous use of multiple medications can lead to unforeseen adverse reactions, compromising patient safety and therapeutic efficacy. Consequently, predicting drug-drug interactions (DDIs) has emerged as a pivotal research focus for improving disease treatment. While recent advances have been made in deep learning models for predicting drug pair relations, the nuanced consideration of individual or cellular conditions as influential contextual factors in DDIs is notably lacking. In this study, building on existing models, we introduce a methodology to predict DDIs through a context-aware architecture. The evident performance improvement over established methodologies underscores the crucial role of the context-aware mechanism in addressing context-conditional DDIs. Furthermore, we perform a systematic ablation analysis to assess the impact of model elements, and we also investigate the potential of incorporating pre-trained molecular representation learning models in this domain.
Yijingxiu Lu · Yinhua Piao · Sun Kim
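One plausible reading of "context-aware architecture" is a pair classifier modulated by a learned context embedding. The PyTorch sketch below is an assumed design for illustration, not the paper's exact model:

```python
import torch
import torch.nn as nn

class ContextAwareDDI(nn.Module):
    """Minimal sketch of a context-aware DDI head (assumed design): the
    drug-pair features are conditioned on a learned context embedding
    (e.g., a cell line) before relation classification."""
    def __init__(self, drug_dim=64, n_contexts=10, n_relations=5):
        super().__init__()
        self.ctx = nn.Embedding(n_contexts, drug_dim)
        self.head = nn.Sequential(
            nn.Linear(3 * drug_dim, 128), nn.ReLU(), nn.Linear(128, n_relations)
        )

    def forward(self, drug_a, drug_b, ctx_id):
        c = self.ctx(ctx_id)                   # context conditions the pair
        return self.head(torch.cat([drug_a, drug_b, c], dim=-1))

model = ContextAwareDDI()
logits = model(torch.randn(2, 64), torch.randn(2, 64), torch.tensor([0, 3]))
print(logits.shape)                            # (2, n_relations)
```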
Thu 7:30 a.m. - 9:30 a.m. | A Shared Encoder for Multi-Source Hyperspectral Images (Poster #299)
Poster Location: Halle B #299
Multi-source hyperspectral images (HSIs) captured from diverse sensors commonly possess varying numbers of bands. When employing deep learning techniques to process them, individual models are required for each source due to the disparate input dimensions. To tackle this problem, we propose a shared encoder that projects all HSIs into a unified feature space. It establishes a general framework for the representation of multi-source HSIs, providing foundational conditions for the development of a universal HSI analysis model.
Weili Kong · Baisen Liu · Xiaojun Bi · Jiaming Pei
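One simple way to realize such a shared encoder is a per-source 1x1 convolution that maps each sensor's band count to a common width, followed by a single shared backbone. The PyTorch sketch below is an assumed design, not necessarily the paper's:

```python
import torch
import torch.nn as nn

class SharedHSIEncoder(nn.Module):
    """Sketch of a shared encoder (assumed design): a per-source 1x1 conv
    maps each sensor's band count to a common channel width, after which
    one shared backbone embeds all sources into the same feature space."""
    def __init__(self, band_counts, dim=64):
        super().__init__()
        self.adapters = nn.ModuleDict({
            name: nn.Conv2d(bands, dim, kernel_size=1)
            for name, bands in band_counts.items()
        })
        self.backbone = nn.Sequential(            # shared across all sources
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, x, source):
        return self.backbone(self.adapters[source](x))

enc = SharedHSIEncoder({"sensorA": 103, "sensorB": 224})
za = enc(torch.randn(1, 103, 32, 32), "sensorA")
zb = enc(torch.randn(1, 224, 32, 32), "sensorB")
print(za.shape, zb.shape)                         # same unified feature space
```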
Thu 7:30 a.m. - 9:30 a.m. | What Does a Visual Formal Analysis of the World's 500 Most Famous Paintings Tell Us About Multimodal LLMs? (Poster #298)
Poster Location: Halle B #298
This work introduces ArtQA, a new benchmark for multimodal LLMs through the lens of formal analysis of paintings. We focus on key elements such as line, shape, space, color, form, value, and texture—collectively referred to as the elements of art in visual formal analysis. ArtQA contains questions spanning 4 metrics, further divided into 16 fine-grained categories. We leverage the power of LLMs to generate VQA questions based on formal analysis of 500 renowned paintings. These questions undergo a rigorous filtering process by both model annotation and human experts, ensuring ArtQA's quality and reliability.
Muzi Tao · Saining Xie
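The generation step of such a pipeline can be sketched as a prompt template per element of art; the wording below is illustrative, not the authors' actual prompt:

```python
ELEMENTS_OF_ART = ["line", "shape", "space", "color", "form", "value", "texture"]

def question_gen_prompt(painting_title: str, element: str) -> str:
    """Prompt template for LLM-based question generation over one element
    of art. Wording is illustrative, not the authors' actual prompt."""
    return (
        f"You are performing a visual formal analysis of '{painting_title}'. "
        f"Write one multiple-choice question testing the element of art "
        f"'{element}' in this painting, with four options and the answer."
    )

print(question_gen_prompt("The Starry Night", "color"))
```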
- | Can LLMs Learn a New Language on the Fly? A Case Study on Zhuang (Poster #297)
Poster Location: Halle B #297
Existing large language models still fail to support many low-resource languages. Especially for the extremely low-resource ones, there is hardly any training data with which to effectively update the model parameters. We thus investigate whether LLMs can learn a new language on the fly through in-context learning. To study this question, we collect a research suite for Zhuang, a language currently supported by no LLM. We study the performance of various LLMs on the Zhuang-Chinese translation task and find great potential in this learning paradigm.
Chen Zhang · Mingxu Tao · Quzhe Huang · Zhibin Chen · Yansong Feng
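The prompting paradigm can be sketched as a prompt that packs a bilingual dictionary and parallel sentences into the context. The format below is illustrative, with placeholders standing in for the real Zhuang data that the paper's research suite provides:

```python
def icl_translation_prompt(dictionary, examples, query) -> str:
    """Pack a bilingual dictionary and parallel sentences into one prompt so
    the model can learn the language in context. Illustrative format only;
    the paper's Zhuang research suite defines its own resources."""
    dict_block = "\n".join(f"{src} = {tgt}" for src, tgt in dictionary)
    ex_block = "\n".join(f"Zhuang: {z}\nChinese: {c}" for z, c in examples)
    return (
        "Vocabulary:\n" + dict_block
        + "\n\nParallel examples:\n" + ex_block
        + f"\n\nTranslate into Chinese.\nZhuang: {query}\nChinese:"
    )

# Placeholders stand in for real Zhuang data from the paper's suite.
prompt = icl_translation_prompt(
    dictionary=[("<zhuang-word>", "<chinese-gloss>")],
    examples=[("<zhuang-sentence>", "<chinese-translation>")],
    query="<zhuang-sentence-to-translate>",
)
print(prompt)
```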
- | Can Perplexity Reflect Large Language Model's Ability in Long Text Understanding? (Poster #296)
Poster Location: Halle B #296
Recent studies have shown that Large Language Models (LLMs) have the potential to process extremely long text, with evidence that LLMs can perform well on the language modeling task with inputs of even 1 million tokens. As the input context length increases, the model's perplexity (PPL) is observed to remain at a low level or even decrease. However, in our study, we find that PPL may only reflect the model's ability to model local information rather than to capture long-range dependencies; thus, using PPL alone to show that a model can process very long contexts is not appropriate. This local focus of perplexity can also explain existing phenomena, such as the strong extrapolation ability of the position-encoding method ALiBi. When evaluating a model's ability on long text, we should pay more attention to the limitations of PPL and avoid over-reliance on it.
Yutong Hu · Quzhe Huang · Mingxu Tao · Chen Zhang · Yansong Feng
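The argument is easy to see numerically: PPL is an average of per-token log-losses, so a small fraction of long-range-dependent tokens barely moves it even when they are predicted badly. A toy calculation with made-up log-probabilities:

```python
import numpy as np

def perplexity(token_log_probs):
    """PPL = exp(-mean per-token log-likelihood)."""
    return float(np.exp(-np.mean(token_log_probs)))

# Suppose 990 of 1000 tokens are predictable from a short local window
# (log-prob -1.0) and only 10 genuinely need long-range context. Even if
# the model misses that context badly (-6.0 instead of -1.0 on those 10),
# the aggregate PPL barely moves: 2.72 -> 2.86.
local = np.full(990, -1.0)
print(perplexity(np.concatenate([local, np.full(10, -1.0)])))  # ~2.72
print(perplexity(np.concatenate([local, np.full(10, -6.0)])))  # ~2.86
```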
- | Can Speculative Sampling Accelerate ReAct Without Compromising Reasoning Quality? (Poster #295)
Poster Location: Halle B #295
Large language models (LLMs) are increasingly used as agents for interaction with external environments. These interplays are commonly facilitated through various prompting paradigms. However, such paradigms require extended interaction traces between the LLMs and the environment, resulting in low task-solving efficiency. In this work, we integrate speculative sampling (SpS) into the ReAct paradigm. In particular, we investigate speculative sampling's impact on the efficiency of ReAct and on the quality of reasoning. Our evaluations on the HotPotQA and FEVER datasets demonstrate that implementing speculative sampling alongside ReAct results in a 2.18x-2.62x acceleration compared to using ReAct alone, while introducing only a negligible impact on reasoning quality.
Han Xu · Haipeng Chen
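The core of speculative sampling is a draft-then-verify accept/reject rule. The sketch below implements the standard single-token rule (Chen et al., 2023; Leviathan et al., 2023) on toy distributions, independent of the ReAct integration:

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(p_target, q_draft, drafted_token):
    """Standard accept/reject rule of speculative sampling: accept the draft
    model's token with prob min(1, p/q); on rejection, resample from the
    normalized residual max(p - q, 0). Toy distributions, single token."""
    p, q = p_target[drafted_token], q_draft[drafted_token]
    if rng.random() < min(1.0, p / q):
        return drafted_token, True
    residual = np.maximum(p_target - q_draft, 0.0)
    return rng.choice(len(p_target), p=residual / residual.sum()), False

q = np.array([0.5, 0.2, 0.1, 0.1, 0.1])   # cheap draft model's distribution
p = np.array([0.3, 0.3, 0.2, 0.1, 0.1])   # target (large) model's distribution
token, accepted = speculative_step(p, q, drafted_token=0)
print(token, accepted)
```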
- | LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked (Poster #294)
Poster Location: Halle B #294
Large language models (LLMs) are popular for high-quality text generation but can also produce harmful responses, as adversarial prompts can bypass their safety measures. We propose LLM Self Defense, a simple approach to defending against these attacks by having an LLM screen the induced responses; it requires no fine-tuning, input preprocessing, or iterative output generation. Instead, we incorporate the generated content into a pre-defined prompt and employ another instance of an LLM to analyze the text and predict whether it is harmful. Notably, LLM Self Defense reduces the attack success rate to virtually 0 against various types of attacks on GPT 3.5 and Llama 2.
Mansi Phute
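The method reduces to a second LLM pass over the generated text. A minimal sketch, with an illustrative prompt (the paper defines its own wording) and `llm` as any prompt-to-completion callable:

```python
HARM_CHECK_TEMPLATE = (
    "Does the following text contain harmful content? Answer 'Yes, this is "
    "harmful' or 'No, this is not harmful'.\nText: {response}"
)

def self_defense_filter(response: str, llm) -> str:
    """Screen a generated response with a second LLM pass. The prompt wording
    is illustrative (the paper defines its own); `llm` is any callable
    mapping a prompt string to a completion string."""
    verdict = llm(HARM_CHECK_TEMPLATE.format(response=response))
    if verdict.strip().lower().startswith("yes"):
        return "[response withheld: flagged as harmful]"
    return response

# Stub backend for demonstration; a real deployment uses an actual LLM.
print(self_defense_filter("hello world", llm=lambda p: "No, this is not harmful"))
```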