Unveiling Bias & Stereotypes
Exploring Cultural Sensitivity in Language Models


Poster
Introduction

In the landscape of artificial intelligence, ensuring cultural sensitivity and inclusivity within language models is crucial as they become increasingly integrated into our lives. This project focuses on understanding and mitigating biases and stereotypes embedded within language models, particularly concerning Indian languages.

Understanding the Landscape: Language models like GPT have revolutionized natural language processing but are susceptible to biases in training data. Previous research has highlighted biases in Indian language prompts, laying the foundation for our exploration.

Previous Experiments: Building upon existing research, our project aims to delve deeper into biases and stereotypes perpetuated by language models. We refer to datasets used in prior studies for consistency.

Approach: Our approach combines meticulous research with practical experimentation. We refine data, develop tools, and address challenges encountered during experimentation.

Progress Snapshot:

  1. Data Refinement: Ensuring consistency in datasets used.
  2. Tool Development: Creation of a flexible Python notebook for generating questions and statements in multiple languages.
  3. Challenges Encountered: Translation difficulties and inconsistencies in model responses.
  4. Results Thus Far: Successful generation of prompt JSONs in English, Telugu, and Hindi. Sample responses compiled for benchmarking. Sentiment analysis conducted using the NRC dataset.
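The prompt-generation step above can be sketched as follows. The category names, target terms, and template string here are illustrative assumptions rather than the project's actual notebook code; the Telugu and Hindi prompt JSONs were produced the same way from their own templates.

```python
import json

# Hypothetical category -> target-term mapping; the real experiment used
# 8 categories totalling 1398 prompts per language.
CATEGORIES = {
    "Age": ["young", "elderly"],
    "Race": ["Asian", "African"],
}

# Illustrative English template; Telugu and Hindi use their own templates.
TEMPLATE = "Describe a {target} person in one word."

def build_prompts(categories, template):
    """Return one prompt record per (category, target) pair."""
    prompts = []
    for category, targets in categories.items():
        for target in targets:
            prompts.append({
                "category": category,
                "prompt": template.format(target=target),
            })
    return prompts

prompts = build_prompts(CATEGORIES, TEMPLATE)
print(json.dumps(prompts[0], indent=2))
```

Serializing the records with `json.dumps` gives the per-language prompt JSONs referenced later in the post.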

Data Categorisation:

For the prompts that we have generated, the data is split into multiple categories. Following is an illustration of the split.


As the illustration shows, there are 8 categories, which resulted in 1,398 prompts per language. We ran this experiment on English, Telugu, and Hindi.


    Metrics and Reflections:

Quantifying progress is crucial for evaluating efficacy. Since the responses are all single keywords, a direct quantitative comparison was difficult. In this experiment, each response keyword is mapped to the NRC Lexicon dataset to determine its sentiment, and the counts of positive and negative responses are then plotted for analysis.
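A minimal sketch of that mapping step, assuming the NRC Emotion Lexicon has been loaded into a dict of word to labels. The tiny lexicon below is a stand-in for the real file, which maps thousands of English words to sentiment and emotion labels:

```python
from collections import Counter

# Stand-in for the NRC Emotion Lexicon (the real file is tab-separated
# and much larger); values are the labels associated with each word.
NRC = {
    "kind": {"positive"},
    "smart": {"positive"},
    "lazy": {"negative"},
    "violent": {"negative", "anger"},
}

def sentiment_counts(keywords, lexicon):
    """Tally how many response keywords carry a positive vs negative label."""
    counts = Counter()
    for word in keywords:
        labels = lexicon.get(word.lower(), set())
        if "positive" in labels:
            counts["positive"] += 1
        if "negative" in labels:
            counts["negative"] += 1
    return counts

# "tall" is absent from the lexicon, so it contributes to neither count.
print(sentiment_counts(["kind", "lazy", "smart", "tall"], NRC))
```

Keywords missing from the lexicon are simply skipped, which is one reason the tallies below need not sum to the total number of prompts.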
     
The following is a visual illustration of the number of positive and negative responses received for each language.

Individual results graph for English

Individual results graph for Telugu
Individual results graph for Hindi



In each of the graphs, the number of positive responses far exceeds the number of negative responses.

Below are word clouds of the predominant keywords generated by ChatGPT (GPT-3.5).

    English Word Cloud


    Telugu Word Cloud

    Hindi Word Cloud
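The word clouds above are driven by keyword frequencies. A minimal sketch of that counting step is below; the sample keywords are invented, and the actual cloud rendering presumably used a plotting library such as `wordcloud`, which is an assumption here:

```python
from collections import Counter

def keyword_frequencies(responses):
    """Normalize response keywords and count occurrences for word-cloud weighting."""
    return Counter(word.strip().lower() for word in responses if word.strip())

# Invented sample responses; casing and stray whitespace are normalized away.
sample = ["Kind", "kind ", "Hardworking", "spiritual", "hardworking", "kind"]
print(keyword_frequencies(sample).most_common(2))
# → [('kind', 3), ('hardworking', 2)]
```

The frequency table is what determines each word's size in the rendered cloud.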



We have plotted the number of responses generated per category per language. The graph again reaffirms that there are more positive responses.

This graph shows a comparison between all 3 languages for the "Age" category.

The graph shows a comparison between all 3 languages for the "Race" category.
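The per-category, per-language tallies behind these comparison graphs can be computed with a simple grouping. The sample records below are invented for illustration; the real data comes from the sentiment-mapped responses:

```python
from collections import defaultdict

# Invented sample records: (language, category, sentiment) per response.
records = [
    ("English", "Age", "positive"),
    ("English", "Age", "negative"),
    ("Telugu", "Age", "positive"),
    ("Hindi", "Race", "positive"),
    ("Telugu", "Race", "positive"),
]

def tally(records):
    """Count responses per (category, language, sentiment), ready for grouped bar charts."""
    counts = defaultdict(int)
    for language, category, sentiment in records:
        counts[(category, language, sentiment)] += 1
    return dict(counts)

for key, n in sorted(tally(records).items()):
    print(key, n)
```

Grouping by category first makes it straightforward to plot one comparison chart per category, as shown for "Age" and "Race".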



Following are the sentiments of the various keywords for each category. More details on this can be found in the report as well.

    Language : English

    Consolidation of negative responses for each category :

The graph shows a comparison of negative responses across all 3 languages.

    Project Video Explanation:

        A detailed video explanation provides insight into the project's methodology and progress.

    Moving Forward:

    Continued refinement of methodologies and collaboration with experts in linguistics, culture, and AI research is essential. The goal is to create more inclusive and culturally-aware AI systems.


    Some pictures from poster presentation day:

    Project Report:

You can find a detailed report of the project here: Report (PDF) (https://drive.google.com/file/d/1OKmUQJBx8hLiZreW67obIEwWZFHYgK3J/view?usp=sharing)

For those interested in delving deeper into our project, the prompt JSONs in English, Telugu, and Hindi can be accessed here:

    Probes: English, Telugu, Hindi

Results: English, Telugu, Hindi

    Team Members:

    Narasimhan Ch - 2023900043

    V R K Pranav Gollakota - 2023900045

    Rachakonda Hrithik Sagar - 2023900021

    Sagar Deelip Dwale - 2023204010

    References:

    1. Busker, T., Choenni, S., & Shoae Bargh, M. (2023, September). Stereotypes in ChatGPT: An empirical study. In Proceedings of the 16th International Conference on Theory and Practice of Electronic Governance (pp. 24-32).

    2. Jha, A., Davani, A., Reddy, C. K., Dave, S., Prabhakaran, V., & Dev, S. (2023). SeeGULL: A stereotype benchmark with broad geo-cultural coverage leveraging generative models. arXiv preprint arXiv:2305.11840.




      


