Poster
in
Workshop: Building Trust in LLMs and LLM Applications: From Guardrails to Explainability to Regulation

Differentially Private Retrieval Augmented Generation with Random Projection

Dixi Yao · Tian Li


Abstract: Large Language Models (LLMs) have gained widespread interest and driven advancements across various fields. Retrieval-Augmented Generation (RAG) enables LLMs to incorporate domain-specific knowledge without retraining. However, evidence shows that RAG poses significant risks of knowledge datastore leakage. This paper proposes a differentially private random projection mechanism to map the datastore onto a lower-dimensional space, mitigating retrieval-based attacks. We establish a theoretical connection between privacy budgets and the projection process, offering a rigorous privacy guarantee. Empirical evaluations demonstrate that our solution achieves superior privacy protection with minimal leakage and a tight privacy bound of $\epsilon=5$, all with negligible impact on generation performance and latency compared to prior methods.
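The abstract describes the mechanism only at a high level. Below is a minimal sketch of one plausible instantiation: clip each datastore embedding to bound sensitivity, apply a Johnson-Lindenstrauss-style Gaussian random projection to a lower dimension, then add Gaussian-mechanism noise. The function name, the clipping step, and the noise calibration are all assumptions for illustration, not the paper's actual construction.

```python
import numpy as np

def dp_random_projection(embeddings, out_dim, epsilon, delta,
                         clip_norm=1.0, seed=0):
    """Project datastore embeddings to a lower dimension with (eps, delta)-DP.

    A sketch only: assumes L2 clipping bounds per-record sensitivity and
    uses standard Gaussian-mechanism noise; the paper's calibration may differ.
    """
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape

    # Clip each embedding's L2 norm so one record's influence is bounded.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    clipped = embeddings * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Random Gaussian projection (Johnson-Lindenstrauss style) to out_dim.
    proj = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(d, out_dim))
    projected = clipped @ proj

    # Gaussian-mechanism noise scaled to the clipped sensitivity (assumed form).
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return projected + rng.normal(0.0, sigma, size=projected.shape)

# Example: 4 embeddings of dimension 8, projected down to dimension 3
# with the abstract's reported budget epsilon = 5.
emb = np.random.default_rng(1).normal(size=(4, 8))
private = dp_random_projection(emb, out_dim=3, epsilon=5.0, delta=1e-5)
print(private.shape)  # (4, 3)
```

Retrieval would then be performed over the noised low-dimensional vectors, so the original datastore embeddings are never exposed to retrieval-based attacks.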
