Posts

Data when seen through the Inference Web of LLM

Background

Today I saw a post from this guy saying "I was at the Marines today". A few minutes later I saw a post from the same guy saying: "I love vada pavs". And I thought to myself, "Hey, this guy is probably from Mumbai, India". The next thought that followed was: if I, a mere human being, can deduce this from two posts, what can LLMs do? LLMs are trained on massive data sources from the internet. Isn't it possible that they infer this information too?

Introduction

Large language models (LLMs) have revolutionized natural language processing tasks. They demonstrate amazing capabilities in understanding and generating human-like text. In this project, we decided to delve into the accuracy of the inferential capabilities of various LLM models.

Objectives

Inference in LLMs is a vast and interesting topic to dive into. For the sake of this project, we chose three main objectives to focus on: identify the accuracy of the inferential capabilities of the LLM; exploring...

What's Your Number? Interpreting Memorisation in Language Models

In the vast realm of machine learning, the concept of "grokking" holds a special allure. It refers to a model's ability to truly understand and generalize beyond mere pattern recognition or memorization. Achieving grokking is a hallmark of true intelligence, where a model can accurately predict or classify data while developing an intuitive grasp of the underlying patterns, relationships, and representations within that data.

Our Approach

As part of our course project, our team set out to explore the parameters and conditions necessary for grokking to occur in machine learning models. We focused our efforts on task complexity, data quantity, hyperparameters, and model architecture, using a character-level decoder-only transformer architecture as our testing ground.

First Steps

Initially, we tackled a simple ROT13 cipher task, which maps each letter to the letter 13 positions after it in the alphabet, wrapping around. However, this task proved too straightforward, and even a single-layered small...
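The ROT13 mapping described above is easy to state precisely in code. A minimal, generic sketch (not the team's actual data-generation pipeline) looks like this:

```python
def rot13(text: str) -> str:
    # Map each letter to the letter 13 positions later, wrapping
    # around the alphabet; non-letters pass through unchanged.
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + 13) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

print(rot13("hello"))  # -> "uryyb"
```

Because 13 is half of 26, ROT13 is its own inverse (`rot13(rot13(s)) == s`), which is part of what makes it such a simple target task for a character-level model.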

Variegated Machine Unlearning - KowalskiAnalysis

Poster Presentation Day

What is 'Machine Unlearning'?

In the age of machine learning, data is widely used and sometimes abused. To tackle the challenges this raises, the notion of machine unlearning comes into play, with a variety of uses in privacy, poison removal, etc.

What is 'Data Poisoning'?

Data poisoning is a type of cyber-attack in which an adversary intentionally compromises a training dataset used by an AI or machine learning (ML) model to influence or manipulate the operation of that model.

What is the motivation?

We take inspiration from the Corrective Machine Unlearning paper by Goel et al. to explore some interesting cases with performance implications for machine unlearning.

What did we do?

We explored multiple problems:

1. Machine Unlearning of Poisons over Imbalanced Datasets: Given the inherent disparity in the representation of different classes in imbalanced datasets, we hypothesize that the impact of machine unlearning of poisons over such datasets should ...
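To make the data-poisoning definition concrete, the simplest classic attack is label flipping: the adversary corrupts a fraction of the training labels. A minimal sketch (the flip fraction, seed, and flipping rule here are arbitrary illustrative choices, not the setup studied in the project):

```python
import random

def poison_labels(labels, flip_fraction=0.1, num_classes=2, seed=0):
    """Return a copy of `labels` with a fraction flipped to a wrong class.
    Illustrative label-flipping attack; parameters are arbitrary."""
    rng = random.Random(seed)
    poisoned = list(labels)
    n_flip = int(len(labels) * flip_fraction)
    for i in rng.sample(range(len(labels)), n_flip):
        wrong = [c for c in range(num_classes) if c != poisoned[i]]
        poisoned[i] = rng.choice(wrong)
    return poisoned

clean = [0, 1] * 50                      # 100 balanced binary labels
dirty = poison_labels(clean, flip_fraction=0.1)
print(sum(c != d for c, d in zip(clean, dirty)))  # -> 10 flipped labels
```

Unlearning the poison then amounts to removing (or counteracting) the influence of those flipped examples after the model has already been trained on them.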

Defining and Detecting Hallucinations in LLMs


EAR-VM: Exploring Methods for Improving Adversarial Robustness of Vision Models

Abstract

CNNs have many uses, particularly in the field of computer vision; however, they are vulnerable to adversarial attacks, and improving their robustness to such attacks is essential to making them safer. The misuse of adversarial attacks is a major threat to CNN vision models; for example, self-driving cars can be made to misinterpret road signs or signals, putting the passengers at significant risk of harm. To address this vulnerability, we have attempted to modify the architecture of a CNN by adding an auxiliary SVM classifier, which determines the maximum margin within which these adversarial attacks will impact loss.

Also, interpretability is a key concept in understanding the decisions and outputs of modern networks. By interpreting the workings of the model, we can curate better adversarial attacks, or make the model more robust.

Objectives

Our goal is to first implement the At-SVM, and then to test it to...
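For readers unfamiliar with adversarial attacks, a classic example is the fast gradient sign method (FGSM): nudge the input in the direction of the sign of the loss gradient. The sketch below applies it to a toy logistic-regression model standing in for a CNN; the weights, input, and step size ε are illustrative assumptions, not part of the project's actual attack setup:

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps=0.1):
    """One FGSM step against a logistic-regression 'model':
    move x in the direction that increases the cross-entropy loss."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))   # predicted probability of class 1
    grad_x = (p - y) * w           # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0])          # toy model weights (assumed)
b = 0.0
x = np.array([0.5, 0.5])           # clean input
y = 1                              # true label
x_adv = fgsm_perturb(x, w, b, y, eps=0.2)
print(x_adv)                       # -> [0.3 0.7]
```

Each coordinate moves by at most ε, yet the perturbed input's score for the true class drops, which is exactly the kind of small-but-damaging perturbation a road-sign attack exploits.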