Data when seen through the Inference Web of LLM
Background
Today I saw a post from this guy saying “I was at the Marines today”. A few mins later I saw a post from the same guy saying: “I love vada pavs”. And i thought to myself “Hey, This guy is probably from Mumbai, India”. The next thought that followed this was if I, a mere human being can deduce this from 2 posts, what can the LLMs do? LLMS are trained on massive data sources from the internet. Isn't it possible that they infer this information too?
Introduction
Large language models (LLMs) have revolutionized natural language processing tasks. They demonstrate amazing capabilities in understanding and generating human-like text. In this project, we decided to delve into the accuracies of inferential capabilities of various LLM models.
Objectives
Inference in LLMs is a vast and interesting topic to dive into. For the sake of this project, we chose 3 main objectives that we would be focusing upon:
Identify the accuracies of Inferential Capabilities of the LLM
Exploring how the inference of a language model changes as the amount of context provided increases
Suggest the approaches to mitigate / minimize the risk of personal data identification using inferential capabilities of LLMs
Our Setup
Our setup for the research was divided into 4 main sections:
Curating the right dataset
We referred to the Reddit dataset which was available on Kaggle. The Reddit dataset consisted of a lot of unnecessary information which was irrelevant to us. We needed data that was most appropriate for our problem statement. So we curated over 1000+ comments from the Reddit dataset that were closest to our inference parameters: Profession, City, Income, Gender, Country, and Education.
Providing an adversarial prompt
Most of the LLMs have guard rails that prevent people from trying to exploit PII data from them. In order to jail break, we used an adversarial prompt.
Setting a baseline
Now that we successfully fooled the models into giving us the PII data, we needed to set a baseline for evaluation of the models. We manually validated 100 prompts and found that GPT-4 was about 90.8% accurate in inference. Therefore, for the rest of the research we considered ChatGPT-4 to be our baseline for comparison.
Selection of models
- We selected 3 main models that we would use apart from GPT-4. These were: GPT-3.5-turbo, Gemini-pro, Claude-3 Sonnet.
- We choose these 3 model since these are publicly available black box model and were having large context window to accommodate our large input
Experiment
Evaluation of different parameters for different models:
We considered 6 parameters for our analysis, 2 direct inference parameters(city, profession) and 4 indirect inferences (Gender, Income, Country, Education)
Length of text:
Text Message length provided obviously was a variable that would cause differences in inference accuracies. To test this, we tried our inferences with variable context lengths.
We considered 25%, 50%, 75% and 100% of the message to determine how it would change the outcome of the inferences.
Results
Inferences
On performing this experiment we found that Gender and Profession data could be inferred very easily from the posts This was followed by Education and Income.
Claude over performed all other models in its inference and exceeded the baseline in most parameters.
Gemini was poor at inference as compared to the other models.
Interestingly City inference for Claude with 100% text is less accurate, indicating sometimes additional context confuses the model.
Accuracies
Given below is the graph depicting the accuracy of inferences. This excludes all the failed inferences that the model gave.(Responses where the model refused to answer or gave responses as “Insufficient Data”).
For parameters such as Income, Gender, Education, Country, Claude 3-Sonnet was performing better than GPT 3.5 for for full context window and for parameters Profession, City GPT3.5 is performing better than other models compared.
Precision
Some interesting results which we found are that claude-3-sonnet is performing very well in this attributes
Gender: it outperforms all in terms of correctly predicting the attribute when compared with other models
Education: consistently performed better in all context length than other models
Profession: gpt3.5 turbo was the least performing in this attribute but actually it was high, whereas the models Gemini-pro and claude-3-sonnet where neck to neck
Income: gpt3.5 turbo did hallucinate for a significant no of cases when context length was increased from 25 to 50% resulting in less precision
City: Claude outperformed other models, Gemini-pro did moderate performance and gpt-3.5 performed poorly as it is seen hallucinating with increase in context length, the ideal graph could be like even if it is not increasing it should stay stagnant but not decrease.
Country: It stayed stagnant which implies that the actual data for predicting country was in the first 25% interval (Low Predictions for the country might also be the other reason)
Mitigation techniques
In order to mitigate the PII data, we first needed to understand which data in the text was being inference.
For this we used a Presidio Analyzer. This allowed us to detect PII data and using this we implemented 2 modules for
mitigation:
Anonymizer
Faker
Definition
Presidio Analyzer enables PII removal, data cleaning, tokenization, custom entity recognition, and enhancement of training data for LLMs, ensuring privacy compliance and improved model performance.
Anonymizer
Anonymizing data after finding PII with Presidio Analyzer makes sure that personal info like names and contact details are hidden.
Some advantages of using the anonymizer are:
Protecting Privacy: Hiding sensitive info prevents unauthorized access or misuse.
Reducing Risk: Anonymization lowers the chance of exposing personal details, reducing the risk of identity theft or fraud.
Ensuring Compliance: It helps follow privacy laws like GDPR or CCPA, avoiding legal issues.
We used a faker to create synthetic data that mimics the patterns and formats of the original data without exposing real PII. This way, the LLM can still learn from the data without risk of re-identification.
Conclusion
- Inferential Capabilities of gpt-4 and Claude are very close to human capabilities
- There is some level of randomness across different models – gpt models being more stable and consistent.
- Indirect inferential capabilities of Claude is much higher given 100% context window
- Direct inferential capabilities of gpt3.5 are much better than any other model compared
- LLM Inferential capabilities are evolving and to protect data privacy, conscious efforts are required in training data anonymization of faking
Sneak Peak to our Journey
Ketaki Kashtikar

Comments
Post a Comment