Poster

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai · Zeqiu Wu · Yizhong Wang · Avi Sil · Hannaneh Hajishirzi

Halle B #63
[ Project Page: https://selfrag.github.io/ ]
Tue 7 May 7:30 a.m. PDT — 9:30 a.m. PDT
 
Oral presentation: Oral 2A
Tue 7 May 6:45 a.m. PDT — 7:30 a.m. PDT

Abstract:

Despite their remarkable capabilities, large language models (LLMs) often produce responses containing factual inaccuracies because they rely solely on the parametric knowledge they encapsulate. Retrieval-Augmented Generation (RAG), an ad hoc approach that augments LMs with retrieval of relevant knowledge, mitigates such issues. However, indiscriminately retrieving and incorporating a fixed number of passages, regardless of whether retrieval is necessary or the passages are relevant, diminishes LM versatility or can lead to unhelpful responses. We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG) that enhances an LM's quality and factuality through retrieval and self-reflection. Our framework trains a single arbitrary LM that adaptively retrieves passages on demand, then generates and reflects on the retrieved passages and its own generations using special tokens, called reflection tokens. Generating reflection tokens makes the LM controllable during inference, enabling it to tailor its behavior to diverse task requirements. Experiments show that Self-RAG (7B and 13B parameters) significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks. Specifically, Self-RAG outperforms ChatGPT and retrieval-augmented Llama2-chat on open-domain QA, reasoning, and fact-verification tasks, and it shows significant gains in factuality and citation accuracy for long-form generation relative to these models. Our code and trained models are available at https://selfrag.github.io/.
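
To give intuition for the inference loop the abstract describes, the sketch below shows how reflection tokens can gate retrieval and rank candidate segments. It is a minimal illustration, not the released implementation: the lm.generate / lm.generate_with_critique / retriever.search interfaces, the token spelling, and the critique weights are all assumptions made for exposition; the actual code and models are at https://selfrag.github.io/.

    from dataclasses import dataclass

    # Hypothetical spelling of the retrieval reflection token; the trained
    # LM emits it when its parametric knowledge is insufficient.
    RETRIEVE = "[Retrieve]"

    @dataclass
    class Candidate:
        text: str      # segment generated from one retrieved passage
        is_rel: float  # prob. of the IsRel critique token (passage relevant)
        is_sup: float  # prob. of the IsSup critique token (output supported)
        is_use: float  # prob. of the IsUse critique token (output useful)

    def critique_score(c, w_rel=1.0, w_sup=1.0, w_use=0.5):
        # Weighted mix of reflection-token probabilities; adjusting the
        # weights at decoding time steers behavior per task (e.g., raise
        # w_sup when citation accuracy matters most). Weights are
        # illustrative, not the paper's values.
        return w_rel * c.is_rel + w_sup * c.is_sup + w_use * c.is_use

    def self_rag_segment(lm, retriever, prompt, k=5):
        # 1. Draft: the LM itself decides whether to retrieve by
        #    emitting (or not) the [Retrieve] reflection token.
        draft = lm.generate(prompt)
        if RETRIEVE not in draft:
            return draft  # parametric knowledge suffices; skip retrieval
        # 2. Retrieve on demand: fetch k passages and generate one
        #    candidate segment per passage, each annotated with
        #    critique-token probabilities (assumed interface).
        passages = retriever.search(prompt, top_k=k)
        candidates = [lm.generate_with_critique(prompt, p) for p in passages]
        # 3. Self-reflection: keep the segment the critique tokens rank best.
        return max(candidates, key=critique_score).text

Because the critique weights enter only at decoding time, the same trained model can trade factual grounding against other qualities per task without retraining, which is what the abstract means by the LM being controllable during inference.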
