Defining and Detecting Hallucinations in Large Language Models
HALLUCINATIONS
Input Context:
Prompt: "Describe the life cycle of a butterfly."
Hallucinated Text:
"The life cycle of a butterfly begins with a caterpillar hatching from an egg. Then, it spins a cocoon and transforms into a unicorn."
In this example, the hallucination occurs when the Language Model generates text that deviates from the factual information provided in the input context. While the first part of the generated text aligns with the prompt by describing the initial stages of a butterfly's life cycle, the mention of transforming into a unicorn is a hallucination—a fabricated detail not grounded in reality.
As we define them here, hallucinations are:
- A property of the training data.
- A function of the input context.
- A function of the model architecture, which is the focus of our exploration here.
Why Do Hallucinations Occur?
Hallucinations in language models can arise from biases in the training data, ambiguity in the input context, limitations in understanding complex scenarios, the model architecture itself, and the inherent complexity of language.
Challenges & Motivation:
In the dynamic realm of large language models (LLMs), hallucinations pose a significant challenge. Because hallucinated outputs can propagate misinformation with serious repercussions, detecting and mitigating them is essential to the credibility, trustworthiness, and utility of LLMs across diverse applications.
HALLUCINATION DETECTION PIPELINE
Exploring Knowledge Graphs (KGs):
- Knowledge Graphs (KGs) serve as powerful frameworks for representing complex information in a structured and easily navigable format. At their core, KGs aim to organize knowledge in a manner that facilitates efficient retrieval and comprehension.
- Within the domain of KGs, the concept of triplets emerges as a cornerstone. These triplets, composed of subject-predicate-object structures, offer a concise and intuitive representation of relationships between entities. Think of them as the building blocks of knowledge, encapsulating factual units in a digestible format.
Inspired by Knowledge Graph Studies:
- The genesis of the triplet approach finds its roots in the realm of knowledge graph studies. Here, researchers recognize the importance of granularity in representing knowledge. Triplets provide a granular view, allowing us to capture nuanced relationships and intricate details within the knowledge domain.
Purpose of Triplets Approach:
- Triplets aren't just about simplifying complex information; they serve a deeper purpose. By breaking down knowledge into its fundamental components, we enhance our ability to analyze, interpret, and leverage it effectively. In essence, triplets empower us to unlock the full potential of knowledge graphs, facilitating deeper insights and informed decision-making.
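As a concrete sketch of the idea above, a triplet can be modeled as a plain subject-predicate-object tuple. The class name, field names, and the example triplets are illustrative assumptions; in the actual pipeline, extraction is performed by an LLM rather than by hand.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """A single subject-predicate-object fact extracted from generated text."""
    subject: str
    predicate: str
    obj: str


# Hypothetical triplets for the butterfly example from the introduction;
# the last one encodes the hallucinated, fabricated fact.
triplets = [
    Triplet("caterpillar", "hatches from", "egg"),
    Triplet("caterpillar", "spins", "cocoon"),
    Triplet("caterpillar", "transforms into", "unicorn"),  # hallucination
]

for t in triplets:
    print(f"({t.subject}, {t.predicate}, {t.obj})")
```

Breaking the generated answer into units this small is what lets each fact be verified independently against the ground truth.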
Dataset Creation:
NQ Dataset
- We began by sampling 100 challenging questions from the Natural Questions (NQ) dataset, a goldmine of diverse queries. Each question was answered by our suite of LLMs: Alpaca-7B, ChatGPT, Claude-2, Falcon-40B, and GPT-4. Using the Claude API, we then queried Claude-2 for each question-answer pair, generating a list of triplets. These triplets were meticulously annotated by hand, classifying each as entailing, contradicting, or neutral with respect to the ground truth.
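The steps above might produce records shaped roughly like the following. This is a hedged sketch; the field names and schema are assumptions for illustration, not the actual dataset format.

```python
# One hypothetical annotated record: a question, a model's answer, and the
# triplets extracted from that answer, each with a human label relative to
# the ground-truth answer.
record = {
    "question": "Describe the life cycle of a butterfly.",
    "model": "Alpaca-7B",  # one of the five LLMs queried
    "answer": "It hatches from an egg, spins a cocoon, and becomes a unicorn.",
    "triplets": [
        # (subject, predicate, object, human annotation)
        ("caterpillar", "hatches from", "egg", "entailment"),
        ("caterpillar", "spins", "cocoon", "entailment"),
        ("caterpillar", "transforms into", "unicorn", "contradiction"),
    ],
}

# Every annotation must be one of the three classes used in the study.
valid_labels = {"entailment", "contradiction", "neutral"}
assert all(t[3] in valid_labels for t in record["triplets"])
```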
Utilizing NLI Models:
- Initially, pre-trained Natural Language Inference (NLI) models were enlisted to ascertain the veracity of the triplets. By examining the entailment of triplets within the provided contexts, we gauged their truthfulness. Employing RoBERTa-Large, a powerful NLI model, we conducted a comprehensive evaluation of each triplet's alignment with its context, thereby establishing a baseline for further analysis.
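Once RoBERTa-Large has assigned each triplet an NLI label against its context, the per-triplet labels have to be aggregated into an answer-level verdict. The rule below (a single contradicted triplet flags the whole answer) is a minimal sketch of one plausible aggregation; the actual rule used in the evaluation may differ.

```python
def triplet_is_supported(nli_label):
    """A triplet counts as truthful only if the context entails it."""
    return nli_label == "entailment"


def answer_is_hallucinated(nli_labels):
    """Aggregate per-triplet NLI labels into an answer-level verdict.

    Assumption: one contradicted triplet is enough to flag the answer as
    containing a hallucination; neutral triplets alone do not flag it.
    """
    return "contradiction" in nli_labels


print(answer_is_hallucinated(["entailment", "entailment", "contradiction"]))  # True
print(answer_is_hallucinated(["entailment", "neutral"]))                      # False
```

Keeping the aggregation rule separate from the NLI model makes it easy to compare stricter variants, such as also flagging answers whose triplets are merely neutral.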
Transitioning to LLMs:
- To shift the paradigm from NLI-based evaluation to a more direct reasoning approach, we entrusted the LLMs themselves with discerning the correctness of the knowledge graph. Claude-3 Sonnet emerged as our prime choice for this task, owing to its exemplary performance and efficiency. Through iterative refinement of prompting strategies, we crafted a tailored prompt to solicit direct feedback from the LLM regarding the validity of the provided knowledge. This pivotal transition not only streamlined the evaluation process but also empowered us to delve deeper into the phenomenon of hallucination within LLMs. The prompt used for it was:
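(The exact prompt is not reproduced in this post.) Purely as an illustration of the approach, a verification prompt for the LLM might be assembled along these lines; the wording and label names below are hypothetical, not the prompt actually used with Claude-3 Sonnet.

```python
def build_verification_prompt(context, triplets):
    """Hypothetical sketch: ask an LLM to judge each triplet against a context."""
    facts = "\n".join(f"- ({s}, {p}, {o})" for s, p, o in triplets)
    return (
        "Given the context below, label each (subject, predicate, object) "
        "triplet as ENTAILED, CONTRADICTED, or NEUTRAL with respect to the "
        "context.\n\n"
        f"Context:\n{context}\n\n"
        f"Triplets:\n{facts}"
    )


prompt = build_verification_prompt(
    "A butterfly's life cycle has four stages: egg, caterpillar, chrysalis, adult.",
    [("caterpillar", "transforms into", "unicorn")],
)
print(prompt)
```

The LLM's labeled responses can then be scored against the human annotations in the same way as the NLI baseline.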
Results:
- For the human annotated data on the triplets with respect to context:
- Results on the triplets by using an NLI on different models:
- Results on the major label on whether it is a hallucination or not by using an NLI on different models:
- Results on multiple new models using NLI:
- Comparison of results on models using NLI against using an LLM instead of NLI: