Posts

Enhancing Machine Learning Interpretability: Tackling Semantic Uncertainties and Hallucinations

Introduction: In the evolving field of machine learning, particularly in natural language generation (NLG), semantic uncertainties and hallucinations present significant challenges. These issues can undermine the reliability of model outputs, leading to misinformation and eroding user trust. This blog explores the origins and impacts of these phenomena, introduces metrics for their detection, and discusses strategies for improving model accuracy. Understanding Semantic Uncertainties: Semantic uncertainties occur when a model generates multiple plausible outputs for the same input, reflecting the inherent ambiguity in human language. Team Llama, comprising Yash Shivhare, Arush Sachdeva, and Vrinda Agarwal, has explored this phenomenon extensively, suggesting approaches for uncertainty estimation: 1. Metrics for Uncertainty Estimation: ROUGE Scores: These assess the overlap between generated text and reference texts, helping gauge the quality of generated content....
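To make the idea of overlap concrete, here is a minimal sketch of a ROUGE-1 style score, assuming plain whitespace tokenization and lowercasing. The function name `rouge_1` and the example sentences are illustrative assumptions, not taken from the post, and production ROUGE implementations typically add stemming and other variants such as ROUGE-2 and ROUGE-L.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (a simplified ROUGE-1).

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both texts.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

A high score only says that the generated text shares words with the reference. It does not by itself show that the text is factually correct, which is why overlap metrics are usually paired with other signals when estimating uncertainty.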

Unveiling Bias & Stereotypes: Exploring Cultural Sensitivity in Language Models

Introduction: In the landscape of artificial intelligence, ensuring cultural sensitivity and inclusivity within language models is crucial as they become increasingly integrated into our lives. This project focuses on understanding and mitigating biases and stereotypes embedded within language models, particularly concerning Indian languages. Understanding the Landscape: Language models like GPT have revolutionized natural language processing but are susceptible to biases in their training data. Previous research has highlighted biases in responses to Indian language prompts, laying the foundation for our exploration. Previous Experiments: Building upon existing research, our project aims to delve deeper into the biases and stereotypes perpetuated by language models. For consistency, we refer to datasets used in prior studies. Approach: O...

Steering Large Language Models Towards Truthful and Reliable Outputs with Representation Engineering

In today's era of rapidly advancing artificial intelligence, large language models (LLMs) like ChatGPT, Gemini, and Claude have emerged as incredibly capable systems. They can engage in substantive conversations, generate creative content, answer complex queries, and even write software from natural language prompts. However, as LLMs become increasingly sophisticated and knowledgeable, an important question arises: How truthful and reliable are their outputs? Despite demonstrating impressive performance across a wide range of tasks, these models currently lack well-defined benchmarks and safeguards to evaluate and ensure truthfulness. The Troubling Tradeoff of Untruthfulness: To examine this problem, our team used a carefully curated dataset of 817 questions from 38 different categories, made public by OpenAI researchers. Running this dataset through various models confirmed a troubling hypothesis: language models' propensity to produce measurable losses in truth...