Poster in Workshop: Quantify Uncertainty and Hallucination in Foundation Models: The Next Frontier in Reliable AI
TINY: Semantic-based Uncertainty Quantification in LLMs: A Case Study on Medical Explanation Generation Task
Nicholas Kian Boon Tan · Mehul Motani
Keywords: [ Generative Models ] [ Natural Language Generation ] [ Large Language Models ] [ Explanation Generation ] [ Interpretability ] [ Uncertainty Quantification ] [ Trustworthy AI ]
Given the often sensible and sometimes nonsensical outputs that modern Large Language Models (LLMs) generate, how should we interpret confident claims such as "Strawberry has two 'r's"? One tool for flagging such overconfident and hallucinatory claims is uncertainty quantification. In particular, this paper investigates the recently proposed semantic density framework for quantifying uncertainty in LLM-generated medical explanations. Semantic density relies on semantic similarity comparisons rather than lexical matching and delivers a per-response estimate of uncertainty. The results demonstrate that the semantic density framework remains performant in this specialized domain, and they raise additional considerations around the utility of the ROUGE metric for semantic evaluation.
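As a rough illustration of the kind of per-response, semantics-aware scoring described above, the sketch below samples several candidate explanations and scores one of them by its density among the others in an embedding space. This is a hedged stand-in, not the paper's implementation: the published semantic density framework builds its density estimate from a dedicated semantic-distance model over sampled responses, whereas here the `semantic_similarity` helper, the `all-MiniLM-L6-v2` encoder, the Gaussian kernel, and the `bandwidth` parameter are all illustrative assumptions.

```python
# Illustrative sketch only: a per-response confidence score in the spirit of
# semantic density, using off-the-shelf sentence embeddings as a stand-in for
# the paper's semantic-distance model. Function names, the encoder choice, and
# the kernel bandwidth are assumptions, not the authors' code.
from typing import List

import numpy as np
from sentence_transformers import SentenceTransformer

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed general-purpose encoder


def semantic_similarity(a: str, b: str) -> float:
    """Cosine similarity between sentence embeddings, as a proxy for semantic closeness."""
    ea, eb = _model.encode([a, b])
    return float(np.dot(ea, eb) / (np.linalg.norm(ea) * np.linalg.norm(eb)))


def per_response_confidence(target: str, samples: List[str], bandwidth: float = 0.5) -> float:
    """Kernel-style density of `target` among sampled responses in semantic space.

    A high value means many sampled responses sit semantically close to `target`
    (low uncertainty); a low value flags a semantically isolated, possibly
    hallucinated response.
    """
    distances = np.array([1.0 - semantic_similarity(target, s) for s in samples])
    weights = np.exp(-((distances / bandwidth) ** 2))  # Gaussian kernel over semantic distance
    return float(weights.mean())


# Usage: sample multiple explanations for the same prompt, then score the one
# selected to be shown to the user.
samples = [
    "Aspirin inhibits COX enzymes, reducing prostaglandin synthesis.",
    "Aspirin blocks cyclooxygenase, lowering prostaglandin production.",
    "Aspirin works by boosting serotonin levels in the gut.",
]
print(per_response_confidence(samples[0], samples[1:]))
```

Note that the second sample, a lexically different paraphrase of the first, would contribute a high similarity here, while ROUGE-style n-gram overlap would score it as dissimilar; this is the tension behind the abstract's closing remark on ROUGE.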