The power of ChatGPT

Introduction

ChatGPT is a conversational AI built on the GPT-3.5 family of language models. One of its notable features is its ability to recognize the emotional tone of a message and respond to it. By analyzing the text it receives and the sentiment behind it, ChatGPT can hold more empathetic and personalized conversations.

Emotion detection is a crucial aspect of effective communication. Through training on large and diverse text datasets, ChatGPT has picked up many emotional nuances and can often interpret the sentiment a user expresses. Whether it's happiness, sadness, anger, or excitement, it can pick up on the emotion and tailor its response accordingly.

When a user interacts with ChatGPT, the model takes into account the wording and the overall context of the message. From these elements it can infer the user's likely emotional state and respond in a way that acknowledges it. This makes the conversation feel more human and engaging.

This ability also extends to more complex signals. ChatGPT can often identify subtle cues such as sarcasm, irony, or humor and adjust its replies so that they match the tone and intent of the user.

How to Use Jupyter with ChatGPT

Step 1: Install the OpenAI library

To call ChatGPT (the gpt-3.5-turbo model) from your Jupyter Notebook, you can use OpenAI's Python library, 'openai'.

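Install the openai library by running the following command in a notebook cell (or in a terminal, without the leading '!'), then import it in your Python code. The version is pinned below 1.0 because the helper function in Step 2 uses openai.ChatCompletion, which was removed in version 1.0 of the library:

!pip install "openai<1.0"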

import openai

Step 2: Set up your OpenAI API key and a helper function

1 - Set up your OpenAI API key by assigning it to the openai.api_key variable. You can obtain your API key from the OpenAI platform. Make sure to keep your API key secure and avoid sharing it publicly:

openai.api_key = 'YOUR_API_KEY'
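Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. Here is a minimal sketch of one alternative, assuming the key has been exported in an environment variable named OPENAI_API_KEY. The helper name load_api_key is hypothetical and is not part of the openai library:

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    load_api_key is a hypothetical helper for this article, not an openai function.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the notebook.")
    return key


# Usage in the notebook (assumes the variable is set):
# openai.api_key = load_api_key()
```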

2 - Define a function that interacts with ChatGPT. You'll pass a prompt to the model and receive a response. Here's a simple example:

def get_completion(prompt, model="gpt-3.5-turbo"):
    # Wrap the prompt in a single user message, as the chat endpoint expects
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # this is the degree of randomness of the model's output
    )
    # Return only the text of the first choice
    return response.choices[0].message["content"]

This helper function makes it easier to send prompts and inspect the generated outputs. For more information, see the OpenAI documentation for the chat completions endpoint.
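The helper above relies on openai.ChatCompletion, which exists only in versions of the openai package before 1.0. Here is a hedged sketch of the same helper for the 1.x client interface. The name get_completion_v1 is chosen for this article, and the client argument is assumed to be an openai.OpenAI() instance; passing it in also makes the function easy to try out with a stand-in object:

```python
def get_completion_v1(client, prompt, model="gpt-3.5-turbo"):
    """Send a single user prompt and return the text of the first choice.

    `client` is assumed to be an `openai.OpenAI()` instance (openai >= 1.0).
    """
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,  # 0 minimizes randomness in the output
    )
    return response.choices[0].message.content


# Usage (requires openai >= 1.0 and a valid API key):
# from openai import OpenAI
# client = OpenAI(api_key="YOUR_API_KEY")
# print(get_completion_v1(client, "Say hello in one word."))
```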

Sentiment analysis

For the sentiment analysis experiment, we use the Sentiment140 dataset from Kaggle, available at the following link: "https://www.kaggle.com/datasets/kazanova/sentiment140".

This dataset contains 1,600,000 tweets extracted using the Twitter API. The tweets have been annotated (0 = negative, 4 = positive) and can be used to detect sentiment.

I- Visualizing Data with Pandas DataFrame

We use only a random sample of 100 tweets from this dataset (50 negative and 50 positive) to assess how well ChatGPT detects sentiment. Such a small subset keeps the number of API calls manageable, although it can only give a rough picture of the sentiments expressed in the full dataset.

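import pandas as pd

# The CSV has no header row; 'tag' is the column that holds the tweet text
df = pd.read_csv("Tweets_Kaggle.csv", names=['target', 'id', 'date', 'flag', 'user', 'tag'], encoding='latin-1')

# Take 50 rows from each half of the file (negative tweets first, then positive),
# shuffle the 100 rows, and renumber them from 0
new = pd.concat([df[0: 800000].sample(50), df[800000: ].sample(50)]).sample(100).reset_index()

# Replace the numeric labels with readable ones
new['target'] = new['target'].replace([0, 4], ['negative', 'positive'])

# Keep only the tweet text and its label
new = new[['tag', 'target']]

display(new.sample(10))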

Here is a sample of 10 tweets:

index	tag	target
97	soccer game today!! my team won!! yay!! i was ...	negative
87	@WerewolfSeth Sexth! Why did you change your p...	negative
17	@Faulsey nice day to be in town, wish I wasn't...	negative
20	@thelazzyone lol	positive
85	off to bed goodnight dears	positive
86	5 more hours in bflo	negative
41	@mileycyrus http://twitpic.com/7f5fy - This pi...	positive
6	This is a bad BAD Cubs team. Breaks my cold l...	negative
49	Gettin ready to head to church. then back to ...	positive
98	Going to manchester with rosie today, she want...	negative
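The sampling step draws 50 tweets from each half of the file, which assumes that the file lists the 800,000 negative tweets before the positive ones. The sketch below illustrates the same balanced draw without pandas and without relying on row order, using a tiny made-up list of labeled rows. The function balanced_sample and the toy data are illustrative assumptions, and the fixed seed only serves to make the draw reproducible:

```python
import random


def balanced_sample(rows, per_class, seed=0):
    """Draw `per_class` rows for each label, then shuffle the combined sample.

    `rows` is a list of (label, text) pairs; the helper is illustrative only.
    """
    rng = random.Random(seed)  # fixed seed so the same sample is drawn every run
    by_label = {}
    for label, text in rows:
        by_label.setdefault(label, []).append((label, text))
    sample = []
    for label in sorted(by_label):
        sample.extend(rng.sample(by_label[label], per_class))
    rng.shuffle(sample)
    return sample


# Toy data standing in for the real tweets (made up for this example).
toy_rows = [("negative", f"sad tweet {i}") for i in range(10)] + \
           [("positive", f"happy tweet {i}") for i in range(10)]
toy_sample = balanced_sample(toy_rows, per_class=3)
print(len(toy_sample))  # 6 rows: 3 negative and 3 positive
```

In pandas, passing random_state to sample() plays the same role as the seed here.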

II- Prediction using ChatGPT

The code segment below runs the sentiment analysis on the DataFrame 'new'. It iterates over the rows of the DataFrame and extracts each 'tag' value as 'text'. At each iteration, a new DataFrame, 'new_data', holds the rows processed so far.

The code then performs sentiment detection by constructing a prompt with the 'text' enclosed in triple quotes. The sentiment analysis model predicts a single-word sentiment ('negative' or 'positive'), which is stored in the 'list_of_prediction' list. The 'new_data' DataFrame is updated with the latest sentiment predictions.

The code displays the 'new_data' DataFrame and prints a cross-tabulation table using 'pd.crosstab()' to analyze the relationship between the target sentiment and the predicted sentiment values.

A 20-second pause using 'time.sleep(20)' is added between iterations because we use gpt-3.5-turbo with a limit of 3 requests per minute.
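The 20-second figure follows from the limit: 60 seconds divided by 3 requests. Here is a small sketch of that arithmetic, where min_delay_seconds is a hypothetical helper and the limit of 3 requests per minute is the one quoted above (the limit on your account may differ):

```python
def min_delay_seconds(requests_per_minute):
    """Smallest pause between calls that stays within a per-minute request limit."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60.0 / requests_per_minute


delay = min_delay_seconds(3)
print(delay)  # 20.0
```

With 100 tweets, the pauses alone add up to 100 × 20 seconds, a little over 33 minutes.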

This code segment allows for sentiment analysis on the 'tag' data within the 'new' DataFrame, providing insights into the relationship between predicted and target sentiment values.

import time

list_of_prediction = []

for i in range(len(new)):
    text = new.loc[i, 'tag']
    # Copy the rows processed so far, so that adding a column does not touch 'new'
    new_data = new.loc[0:i].copy()

    #---------------Detecting sentiment---------------#
    prompt = f"""
               You will be provided with text delimited by triple quotes.
               you will return the sentiment of this text
               limit your answer only by a single word:
               'negative' or 'positive'

               \"\"\"{text}\"\"\"
               """
    response = get_completion(prompt)
    list_of_prediction.append(str(response).lower())
    new_data['prediction'] = list_of_prediction
    display(new_data)
    print(pd.crosstab(new_data.target, new_data.prediction))
    time.sleep(20)  # stay under the limit of 3 requests per minute
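The prompt asks for a single word, but nothing forces the model to comply: it may add punctuation, capitalize the answer, or return a label outside the two requested (the results further down include a 'neutral' column). Here is a minimal sketch of a post-processing step. The normalize_label helper and the 'other' fallback are additions for illustration and are not part of the original notebook:

```python
import string


def normalize_label(response, allowed=("negative", "positive", "neutral")):
    """Map a raw model reply to one of the allowed labels, or 'other'."""
    # Lowercase, then strip surrounding whitespace, quotes, and punctuation.
    cleaned = response.lower().strip().strip(string.punctuation + string.whitespace)
    return cleaned if cleaned in allowed else "other"


print(normalize_label(" 'Positive'. "))  # positive
print(normalize_label("I think it is negative"))  # other
```

Inside the loop, this would replace str(response).lower() when appending to list_of_prediction.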

The code below uses pandas to cross-tabulate the targets and the predictions in the DataFrame "new_data".

With normalize='index', each row is divided by its total, so the resulting table, named "results," shows how the predictions are distributed within each target category. The "display()" function then renders the table in the notebook.

results = pd.crosstab(new_data.target, new_data.prediction, normalize='index')

display(results)

prediction	negative	neutral	positive
target
negative	0.68	0.04	0.28
positive	0.12	0.06	0.82

Looking at the values in the table:

For the "negative" target category, the model labels 68% of the tweets as negative, 4% as neutral, and 28% as positive.
For the "positive" target category, the model labels 12% of the tweets as negative, 6% as neutral, and 82% as positive.
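Because the sample holds 50 negative and 50 positive tweets, these proportions correspond to whole counts (for example, 0.68 × 50 = 34). The sketch below recomputes the row-normalized table in plain Python. The counts dictionary is reconstructed from the reported proportions and is not copied from the notebook:

```python
# Counts reconstructed from the reported proportions (50 tweets per target class).
counts = {
    "negative": {"negative": 34, "neutral": 2, "positive": 14},
    "positive": {"negative": 6, "neutral": 3, "positive": 41},
}


def normalize_rows(table):
    """Divide every cell by its row total, like normalize='index' in pd.crosstab."""
    normalized = {}
    for target, row in table.items():
        total = sum(row.values())
        normalized[target] = {pred: round(n / total, 2) for pred, n in row.items()}
    return normalized


row_shares = normalize_rows(counts)
print(row_shares["negative"])  # {'negative': 0.68, 'neutral': 0.04, 'positive': 0.28}
print(row_shares["positive"])  # {'negative': 0.12, 'neutral': 0.06, 'positive': 0.82}
```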

Let's now look at the share of correct predictions by ChatGPT over the whole sample.

result2 = pd.crosstab(new_data.target, new_data.prediction, normalize='all')

display(result2)

prediction	negative	neutral	positive
target
negative	0.34	0.02	0.14
positive	0.06	0.03	0.41

Adding the two cells where the prediction matches the target (0.34 + 0.41), we find that ChatGPT reaches 75% accuracy on this sample.
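Here is the same calculation as a plain-Python sketch, using the proportions from the table above. The variable names are illustrative, and the last figure, the accuracy among tweets that received a 'negative' or 'positive' answer, is an extra view that the original analysis does not report:

```python
# Proportions of all 100 tweets, keyed by (target, prediction), from the table above.
shares = {
    ("negative", "negative"): 0.34, ("negative", "neutral"): 0.02, ("negative", "positive"): 0.14,
    ("positive", "negative"): 0.06, ("positive", "neutral"): 0.03, ("positive", "positive"): 0.41,
}

# Accuracy: share of tweets whose prediction equals the target (the diagonal).
accuracy = sum(p for (target, pred), p in shares.items() if target == pred)

# Share of tweets for which the model answered 'neutral' despite the two-label prompt.
neutral_share = sum(p for (_, pred), p in shares.items() if pred == "neutral")

# Accuracy restricted to tweets that received a 'negative' or 'positive' answer.
accuracy_without_neutral = accuracy / (1 - neutral_share)

print(round(accuracy, 2))                  # 0.75
print(round(neutral_share, 2))             # 0.05
print(round(accuracy_without_neutral, 3))  # 0.789
```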

In conclusion, ChatGPT shows a solid ability to detect positive and negative sentiment in tweets, with 75% of its predictions matching the labels on our sample of 100 tweets. This suggests it can be useful for tasks such as opinion mining and other applications that classify sentiment in text. However, further evaluation is needed to assess its performance on larger samples and across different datasets and domains.