Defining and Detecting Hallucinations in LLMs


HALLUCINATIONS

Hallucinations occur when a model generates text that deviates from the input context or presents inaccurate information.

Example:

Input Context:
Prompt: "Describe the life cycle of a butterfly."

Hallucinated Text:
"The life cycle of a butterfly begins with a caterpillar hatching from an egg. Then, it spins a cocoon and transforms into a unicorn."

In this example, the hallucination occurs when the Language Model generates text that deviates from the factual information provided in the input context. While the first part of the generated text aligns with the prompt by describing the initial stages of a butterfly's life cycle, the mention of transforming into a unicorn is a hallucination—a fabricated detail not grounded in reality.

As we define them here, hallucinations are:

  •  A property of the training data.
  •  A function of the input context.
  •  A function of the model architecture, which we explore here.

Entailment: The response is in line with the reference.
Contradiction: The response contradicts the reference.
Neutral: No conclusion can be reached; the response is neutral.
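The three labels can be captured in a small sketch. Note that the response-level aggregation rule here (any contradicting triplet flags the response as hallucinated) is our assumption; the post does not spell it out:

```python
from enum import Enum

class TripletLabel(Enum):
    """Label assigned to one triplet judged against the reference."""
    ENTAILMENT = "entailment"        # supported by the reference
    CONTRADICTION = "contradiction"  # conflicts with the reference
    NEUTRAL = "neutral"              # neither supported nor refuted

def is_hallucinated(labels):
    """Flag a response as hallucinated if any of its triplets
    contradicts the reference (assumed aggregation rule)."""
    return any(label is TripletLabel.CONTRADICTION for label in labels)
```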

Why Do Hallucinations Occur?

Hallucinations in Language Models can happen due to biases in training data, ambiguity in input context, limitations in understanding complex scenarios, model architecture, and the complexity of language itself.

Challenges & Motivation:

In the dynamic realm of Large Language Models (LLMs), hallucinations pose a significant challenge. They can propagate misinformation with serious repercussions, so detecting and mitigating them is imperative for the credibility, dependability, and utility of LLMs across diverse applications.

HALLUCINATION DETECTION PIPELINE

Exploring Knowledge Graphs (KGs):

  • Knowledge Graphs (KGs) serve as powerful frameworks for representing complex information in a structured and easily navigable format. At their core, KGs aim to organize knowledge in a manner that facilitates efficient retrieval and comprehension.

The Essence of Triplets:

  • Within the domain of KGs, the concept of triplets emerges as a cornerstone. These triplets, composed of subject-predicate-object structures, offer a concise and intuitive representation of relationships between entities. Think of them as the building blocks of knowledge, encapsulating factual units in a digestible format.


Inspired by Knowledge Graph Studies: 

  • The genesis of the triplet approach finds its roots in the realm of knowledge graph studies. Here, researchers recognize the importance of granularity in representing knowledge. Triplets provide a granular view, allowing us to capture nuanced relationships and intricate details within the knowledge domain.

Purpose of Triplets Approach:

  • Triplets aren't just about simplifying complex information; they serve a deeper purpose. By breaking down knowledge into its fundamental components, we enhance our ability to analyze, interpret, and leverage it effectively. In essence, triplets empower us to unlock the full potential of knowledge graphs, facilitating deeper insights and informed decision-making.
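A triplet is just a tiny subject-predicate-object record. As a sketch, the hallucinated butterfly answer from earlier decomposes into these factual units (the exact decomposition is our illustration, not the project's):

```python
from typing import NamedTuple

class Triplet(NamedTuple):
    subject: str
    predicate: str
    obj: str

# The hallucinated butterfly answer, broken into factual units:
butterfly_triplets = [
    Triplet("caterpillar", "hatches from", "egg"),        # grounded
    Triplet("caterpillar", "spins", "cocoon"),            # grounded
    Triplet("caterpillar", "transforms into", "unicorn"), # fabricated
]
```

Only the last triplet is ungrounded, which is exactly the granularity the triplet approach is meant to expose.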

    Dataset Creation:

    [Figure: NQ (Natural Questions) dataset overview]

    • We commenced by sampling 100 challenging questions from the NQ (Natural Questions) dataset, a goldmine of diverse queries. Each question was answered by our suite of LLMs: Alpaca-7B, ChatGPT, Claude-2, Falcon-40B, and GPT-4. Leveraging the Claude API, we queried the Claude-2 model for each question-answer pair, generating a list of triplets. These triplets were meticulously annotated as entailing, contradicting, or neutral with respect to the ground truths.
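Turning the model's reply into triplets requires a small parsing step. This is a minimal sketch assuming the model is asked to emit one `(subject | predicate | object)` line per triplet; the post does not show Claude-2's raw output format:

```python
import re

def parse_triplets(model_output):
    """Extract (subject, predicate, object) tuples from a reply
    formatted as '(subject | predicate | object)' lines.
    The pipe-delimited format is an assumption."""
    triplets = []
    for match in re.finditer(r"\(([^)]*)\)", model_output):
        parts = [p.strip() for p in match.group(1).split("|")]
        if len(parts) == 3:
            triplets.append(tuple(parts))
    return triplets
```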

    Utilizing NLI Models: 

    • Initially, pre-trained Natural Language Inference (NLI) models were enlisted to ascertain the veracity of the triplets. By examining the entailment of triplets within the provided contexts, we gauged their truthfulness. Employing RoBERTa-Large, a powerful NLI model, we conducted a comprehensive evaluation of each triplet's alignment with its context, thereby establishing a baseline for further analysis.
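Since NLI models expect full sentences, each triplet must first be rendered as a hypothesis to score against the context (the premise). A sketch; the `roberta-large-mnli` checkpoint named in the comment is our guess at the RoBERTa-Large model referred to above:

```python
def triplet_to_hypothesis(subject, predicate, obj):
    """Render a triplet as a natural-language hypothesis so an NLI
    model can score it against the context (the premise)."""
    return f"{subject} {predicate} {obj}."

# Hedged usage with Hugging Face transformers (checkpoint name is
# an assumption):
#
#   from transformers import pipeline
#   nli = pipeline("text-classification", model="roberta-large-mnli")
#   out = nli({"text": context, "text_pair": triplet_to_hypothesis(*t)})
#   # out["label"] is one of ENTAILMENT / CONTRADICTION / NEUTRAL
```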

    Transitioning to LLMs: 

    • To shift from NLI-based evaluation to a more direct reasoning approach, we entrusted the LLMs themselves with discerning the correctness of the knowledge graph. Claude-3 Sonnet emerged as our prime choice for this task, owing to its strong performance and efficiency. Through iterative refinement of prompting strategies, we crafted a tailored prompt to solicit direct feedback from the LLM regarding the validity of the provided knowledge. This transition not only streamlined the evaluation process but also let us probe the phenomenon of hallucination within LLMs more deeply.
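The exact prompt used in the project is not reproduced here, but a representative template for asking the LLM to judge one triplet against the context might look like this (entirely illustrative):

```python
def build_judge_prompt(context, triplet):
    """Assemble a prompt asking the LLM to judge one triplet against
    the reference context. Illustrative template only; not the
    project's actual prompt."""
    s, p, o = triplet
    return (
        f"Context: {context}\n"
        f"Claim: ({s}, {p}, {o})\n"
        "Does the context entail, contradict, or say nothing about the "
        "claim? Answer with exactly one word: entailment, contradiction, "
        "or neutral."
    )
```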


    Results:

    • Human-annotated labels on the triplets with respect to context: [figure: bar chart]
    • Triplet-level results using an NLI model across different LLMs: [figure: bar chart]
    • Response-level results (hallucination or not) using an NLI model across different LLMs: [figure: bar chart]
    • Results on additional models using NLI: [figure: bar chart]
    • Comparison of results using an NLI model versus using an LLM as the judge: [figure: bar chart]


    Analysis:

    The analysis of our results sheds light on how effectively hallucinations can be predicted in LLMs given contextual information. Initially, Mistral outperforms the other models, which we attribute to its higher parameter count. Falcon and GPT-4, however, emerge as the frontrunners, surpassing all other models, again largely due to their substantial parameter sizes. This correlation between performance and parameter count underscores the pivotal role of model size in achieving superior results.
    Moreover, our comparison between LLMs and Natural Language Inference (NLI) models reveals a notable advantage of LLMs in hallucination detection. NLI models struggle with triplets because they are trained on full sentences, whereas LLMs handle them well, leveraging their extensive next-word-prediction training and greater parameter capacity.

    Conclusion:

    In conclusion, our study focused on the critical task of detecting hallucinations in LLMs, leveraging a dataset of triplets sourced from challenging questions in the NQ dataset. Through comprehensive evaluation using both pre-trained NLI models and LLMs, we have elucidated the strengths of LLMs in this domain. Our findings underscore the potential of LLMs to enhance the accuracy and reliability of text generation by mitigating hallucination occurrences. Moving forward, further research endeavors could delve into advanced methodologies for detecting and addressing hallucinations, thereby bolstering the trustworthiness and applicability of LLMs across diverse domains.
    YOUTUBE LINK: Video
    GITHUB LINK: RSAI_Project
