Twitter Sentiment Analysis
Getting data from Twitter
- Using Tweepy (wraps the official Twitter API)
  - easy to use
  - subject to several limitations, chiefly rate limits
  - see https://developer.twitter.com/en/docs/rate-limits
- Using Twint (unofficial scraper)
  - a little more difficult to use
  - fewer limitations
  - well suited to large volumes
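Because the official API is rate limited, a collection script usually has to pause and retry. The following is a minimal sketch of exponential backoff using only the standard library; `RateLimited` and `call_with_backoff` are hypothetical names for illustration, not part of Tweepy. Tweepy can also do the waiting itself if `tweepy.API` is constructed with `wait_on_rate_limit=True`.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error a client raises when rate limited."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on RateLimited wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            # give up after the last allowed attempt
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the waiting behaviour can be checked without actually sleeping.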
Analysis
- VADER sentiment, using the vaderSentiment Python package: https://towardsdatascience.com/sentimental-analysis-using-vader-a3415fef7664
- Naive Bayes for sentiment analysis (a bit more DIY): https://www.datacamp.com/community/tutorials/simplifying-sentiment-analysis-python
- spaCy package
- Gensim package
- etc.
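To give a feel for the DIY Naive Bayes route, here is a rough sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. It is not the code from the linked tutorial; the class name `NaiveBayes` and the four training sentences are made up purely for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# toy, made-up training data (an assumption for this sketch)
train_texts = ["great news for farmers", "good progress on vaccines",
               "terrible damage from rains", "bad delay and damage"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)
```

A real classifier would need a labelled tweet corpus and better tokenisation; the point here is only the shape of the algorithm.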
Installation
Clone the repo
git clone https://github.com/cloudfactory/sentiment_analysis_twitter_starter_code
cd into the cloned directory and create a virtualenv
python -m venv env
Activate the virtualenv
source env/bin/activate
Install dependency packages from requirements.txt
pip install -r requirements.txt
Open a JupyterLab session
jupyter-lab
A simple Twitter sentiment analysis PoC
import tweepy
import json
from tweepy import OAuthHandler
import pandas as pd
Full Tweepy documentation: https://docs.tweepy.org/en/stable/client.html#tweets (that page covers the v2 `tweepy.Client`; the PoC below uses the v1.1 `tweepy.API` interface)
Access keys
Apply for a Twitter developer account to receive these credentials:
API_SECRET_KEY = "fill this in"
API_KEY = "fill this in"
ACCESS_TOKEN = "fill this in"
ACCESS_TOKEN_SECRET = "fill this in"
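Hardcoding the keys is fine for a quick PoC, but they are easy to commit by accident. One possible alternative, sketched below, reads them from environment variables. The helper `load_credentials` and the variable names (`TWITTER_API_KEY` and so on) are assumptions made for this example, not something Tweepy or the repo defines.

```python
import os


def load_credentials(env=os.environ):
    """Read the four Twitter credentials from environment variables.

    The environment variable names are an assumed convention for this sketch.
    """
    names = {
        "API_KEY": "TWITTER_API_KEY",
        "API_SECRET_KEY": "TWITTER_API_SECRET_KEY",
        "ACCESS_TOKEN": "TWITTER_ACCESS_TOKEN",
        "ACCESS_TOKEN_SECRET": "TWITTER_ACCESS_TOKEN_SECRET",
    }
    # fail early with a clear message instead of a confusing auth error later
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned dictionary uses the same names as the constants above, so `creds["API_KEY"]` could be passed wherever `API_KEY` is used.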
class TwitterClient:
    '''
    Minimal Twitter client built on Tweepy.
    '''
    def __init__(self):
        '''
        Authenticate with the keys defined above and build the API object.
        '''
        try:
            self.auth = OAuthHandler(API_KEY, API_SECRET_KEY)
            self.auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
            self.api = tweepy.API(self.auth)
        except Exception as err:
            # re-raise so a half-initialised client is never used
            print(f'Error: Authentication error: {err}')
            raise

    def get_tweets(self):
        # fetch the 20 most recent tweets from the @kathmandupost timeline
        tweets = self.api.user_timeline(screen_name='kathmandupost', count=20)
        return tweets
raw = TwitterClient().get_tweets()
df = pd.json_normalize([r._json for r in raw])
df.head()
| | created_at | id | id_str | text | truncated | source | in_reply_to_status_id | in_reply_to_status_id_str | in_reply_to_user_id | in_reply_to_user_id_str | ... | user.profile_text_color | user.profile_use_background_image | user.has_extended_profile | user.default_profile | user.default_profile_image | user.following | user.follow_request_sent | user.notifications | user.translator_type | user.withheld_in_countries |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Tue Oct 26 01:20:38 +0000 2021 | 1452807348309868564 | 1452807348309868564 | EDITORIAL: Protect paddy farmers\n\nEase of ac... | True | <a href="https://about.twitter.com/products/tw... | None | None | None | None | ... | 333333 | True | False | False | False | False | False | False | none | [] |
1 | Tue Oct 26 00:45:00 +0000 2021 | 1452798380527161348 | 1452798380527161348 | Despite some opposition, Oli appears to have p... | True | <a href="https://about.twitter.com/products/tw... | None | None | None | None | ... | 333333 | True | False | False | False | False | False | False | none | [] |
2 | Mon Oct 25 23:15:00 +0000 2021 | 1452775731050782726 | 1452775731050782726 | Over 100,000 doses of Pfizer-BioNtech vaccine ... | True | <a href="https://about.twitter.com/products/tw... | None | None | None | None | ... | 333333 | True | False | False | False | False | False | False | none | [] |
3 | Mon Oct 25 21:45:00 +0000 2021 | 1452753082111209475 | 1452753082111209475 | Congress may appoint deputy Speaker, leaving s... | True | <a href="https://about.twitter.com/products/tw... | None | None | None | None | ... | 333333 | True | False | False | False | False | False | False | none | [] |
4 | Mon Oct 25 20:15:00 +0000 2021 | 1452730433033015296 | 1452730433033015296 | Everything you need to know about the Covid-19... | True | <a href="https://about.twitter.com/products/tw... | None | None | None | None | ... | 333333 | True | False | False | False | False | False | False | none | [] |
5 rows × 70 columns
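The dotted column names such as `user.following` appear because `pd.json_normalize` flattens nested dictionaries into one level. The sketch below illustrates that idea with the standard library only; it is not how pandas implements it, and the `flatten` helper and the sample record are made up for the example.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # recurse into nested objects such as the embedded user
            flat.update(flatten(value, full_key, sep))
        else:
            # scalars and lists are kept as they are
            flat[full_key] = value
    return flat


# made-up record shaped loosely like a tweet payload
tweet = {"id": 1, "text": "hello", "user": {"following": False, "translator_type": "none"}}
flat = flatten(tweet)
```

Each nested `user` field becomes its own column, which is why a 20-tweet timeline yields 70 columns.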
df.text
0 EDITORIAL: Protect paddy farmers\n\nEase of ac...
1 Despite some opposition, Oli appears to have p...
2 Over 100,000 doses of Pfizer-BioNtech vaccine ...
3 Congress may appoint deputy Speaker, leaving s...
4 Everything you need to know about the Covid-19...
5 United States to provide 100,620 doses of Pfiz...
6 Paddy damage by freak rains estimated at Rs8.2...
7 Dalit representatives complain of social discr...
8 Consult Delhi for census in Kalapani, census b...
9 Supreme Court justices to boycott full court m...
10 London expands vehicle levy to improve air qua...
11 ‘Children are going to die’, UN agency warns a...
12 EDITORIAL: Railblock ahead\n\nDelay in operati...
13 Nepal reports 673 new Covid-19 cases, 13 death...
14 Whether it’s supply or demand, oil era heads f...
15 Justices to decide their further step after me...
16 Climate change: what are the economic stakes?\...
17 Oslo opens museum to “The Scream” painter Munc...
18 Everything you need to know about the Covid-19...
19 Fauci says vaccines for kids between 5-11 like...
Name: text, dtype: object
Sentiment analysis
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
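# build the analyzer once; constructing it for every tweet reloads the lexicon
sid_obj = SentimentIntensityAnalyzer()
def sentiment_scores(sentence):
    # returns a dict with 'neg', 'neu', 'pos' and 'compound' scores
    return sid_obj.polarity_scores(sentence)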
df['sentiment_scores'] = df.text.apply(sentiment_scores)
df[['text', 'sentiment_scores']]
| | text | sentiment_scores |
---|---|---|
0 | EDITORIAL: Protect paddy farmers\n\nEase of ac... | {'neg': 0.0, 'neu': 0.779, 'pos': 0.221, 'comp... |
1 | Despite some opposition, Oli appears to have p... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
2 | Over 100,000 doses of Pfizer-BioNtech vaccine ... | {'neg': 0.0, 'neu': 0.872, 'pos': 0.128, 'comp... |
3 | Congress may appoint deputy Speaker, leaving s... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
4 | Everything you need to know about the Covid-19... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
5 | United States to provide 100,620 doses of Pfiz... | {'neg': 0.0, 'neu': 0.865, 'pos': 0.135, 'comp... |
6 | Paddy damage by freak rains estimated at Rs8.2... | {'neg': 0.276, 'neu': 0.724, 'pos': 0.0, 'comp... |
7 | Dalit representatives complain of social discr... | {'neg': 0.134, 'neu': 0.753, 'pos': 0.113, 'co... |
8 | Consult Delhi for census in Kalapani, census b... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
9 | Supreme Court justices to boycott full court m... | {'neg': 0.164, 'neu': 0.582, 'pos': 0.255, 'co... |
10 | London expands vehicle levy to improve air qua... | {'neg': 0.145, 'neu': 0.683, 'pos': 0.173, 'co... |
11 | ‘Children are going to die’, UN agency warns a... | {'neg': 0.277, 'neu': 0.723, 'pos': 0.0, 'comp... |
12 | EDITORIAL: Railblock ahead\n\nDelay in operati... | {'neg': 0.113, 'neu': 0.887, 'pos': 0.0, 'comp... |
13 | Nepal reports 673 new Covid-19 cases, 13 death... | {'neg': 0.17, 'neu': 0.83, 'pos': 0.0, 'compou... |
14 | Whether it’s supply or demand, oil era heads f... | {'neg': 0.061, 'neu': 0.939, 'pos': 0.0, 'comp... |
15 | Justices to decide their further step after me... | {'neg': 0.076, 'neu': 0.762, 'pos': 0.162, 'co... |
16 | Climate change: what are the economic stakes?\... | {'neg': 0.0, 'neu': 0.901, 'pos': 0.099, 'comp... |
17 | Oslo opens museum to “The Scream” painter Munc... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
18 | Everything you need to know about the Covid-19... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
19 | Fauci says vaccines for kids between 5-11 like... | {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound... |
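Each score dictionary also carries a `compound` value between -1 and 1, which is cut off in the display above. The VADER authors suggest treating a compound of 0.05 or more as positive and -0.05 or less as negative. A small sketch of that rule follows; the helper name `label_from_compound` and the sample dictionaries are hypothetical.

```python
def label_from_compound(scores, threshold=0.05):
    """Map a VADER score dict to 'positive', 'negative' or 'neutral'."""
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# made-up score dictionaries in the shape polarity_scores returns
example_positive = {"neg": 0.0, "neu": 0.5, "pos": 0.5, "compound": 0.6}
example_neutral = {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": 0.0}
```

With the DataFrame above, this could be applied as `df.sentiment_scores.apply(label_from_compound)`.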
sentiments = pd.concat([df, df.sentiment_scores.apply(pd.Series)], axis=1)[['text', 'neg', 'neu', 'pos']]
sentiments
| | text | neg | neu | pos |
---|---|---|---|---|
0 | EDITORIAL: Protect paddy farmers\n\nEase of ac... | 0.000 | 0.779 | 0.221 |
1 | Despite some opposition, Oli appears to have p... | 0.000 | 1.000 | 0.000 |
2 | Over 100,000 doses of Pfizer-BioNtech vaccine ... | 0.000 | 0.872 | 0.128 |
3 | Congress may appoint deputy Speaker, leaving s... | 0.000 | 1.000 | 0.000 |
4 | Everything you need to know about the Covid-19... | 0.000 | 1.000 | 0.000 |
5 | United States to provide 100,620 doses of Pfiz... | 0.000 | 0.865 | 0.135 |
6 | Paddy damage by freak rains estimated at Rs8.2... | 0.276 | 0.724 | 0.000 |
7 | Dalit representatives complain of social discr... | 0.134 | 0.753 | 0.113 |
8 | Consult Delhi for census in Kalapani, census b... | 0.000 | 1.000 | 0.000 |
9 | Supreme Court justices to boycott full court m... | 0.164 | 0.582 | 0.255 |
10 | London expands vehicle levy to improve air qua... | 0.145 | 0.683 | 0.173 |
11 | ‘Children are going to die’, UN agency warns a... | 0.277 | 0.723 | 0.000 |
12 | EDITORIAL: Railblock ahead\n\nDelay in operati... | 0.113 | 0.887 | 0.000 |
13 | Nepal reports 673 new Covid-19 cases, 13 death... | 0.170 | 0.830 | 0.000 |
14 | Whether it’s supply or demand, oil era heads f... | 0.061 | 0.939 | 0.000 |
15 | Justices to decide their further step after me... | 0.076 | 0.762 | 0.162 |
16 | Climate change: what are the economic stakes?\... | 0.000 | 0.901 | 0.099 |
17 | Oslo opens museum to “The Scream” painter Munc... | 0.000 | 1.000 | 0.000 |
18 | Everything you need to know about the Covid-19... | 0.000 | 1.000 | 0.000 |
19 | Fauci says vaccines for kids between 5-11 like... | 0.000 | 1.000 | 0.000 |
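One simple way to summarise a table like this is to count how many tweets lean positive or negative. The sketch below is only one possible heuristic, not part of VADER: it compares `pos` with `neg` and ignores the neutral share. The four rows are copied from rows 0, 1, 6 and 9 above; the `lean` helper is a name invented for this example.

```python
from collections import Counter

# rows 0, 1, 6 and 9 from the table above
rows = [
    {"neg": 0.000, "neu": 0.779, "pos": 0.221},
    {"neg": 0.000, "neu": 1.000, "pos": 0.000},
    {"neg": 0.276, "neu": 0.724, "pos": 0.000},
    {"neg": 0.164, "neu": 0.582, "pos": 0.255},
]


def lean(row):
    """Return which way a tweet leans, ignoring the neutral share."""
    if row["pos"] > row["neg"]:
        return "pos"
    if row["neg"] > row["pos"]:
        return "neg"
    return "neutral"


summary = Counter(lean(row) for row in rows)
```

For the full DataFrame, the same function could be applied row by row with `sentiments.apply(lean, axis=1).value_counts()`.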