The Emojis of Kathmandu
50K tweets from 2012 to 2021 (containing emojis and geocoded to Kathmandu) show that we really love ๐
Recently, though, we have been ๐ less and ๐ more, ๐ less and ❤️ more. It just goes to show that Kathmandu is healing from Covid.
Github repo: https://github.com/ayushsubedi/emojan
Create the dataset using twint
import twint
import pandas as pd
import nest_asyncio
nest_asyncio.apply()
emoji_list = ['๐',
'๐',
'๐',
'๐',
'🇳🇵',
'๐',
'โ',
'๐ถ',
'๐',
'๐ท',
'๐',
'๐',
'๐',
'๐',
'๐',
'โบ๏ธ',
'๐ด',
'๐ฑ',
'๐',
'๐',
'๐',
'๐',
'๐',
'๐',
'๐ฉ',
'๐',
'๐',
'๐ญ',
'๐ณ',
'๐',
'๐',
'๐',
'โ๏ธ',
'โ๏ธ',
'๐',
'❤️',
'๐',
'๐']
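# Run one twint search per emoji and store each result in its own CSV.
for emoji in emoji_list:
    c = twint.Config()
    c.Search = emoji          # search term: the emoji itself
    c.Pandas = True
    c.Store_csv = True
    c.Output = emoji          # output is named after the emoji
    c.Hide_output = True
    c.Near = "kathmandu"      # restrict to tweets geocoded near Kathmandu
    c.Since = "2010-01-01"
    c.Until = "2021-05-31"
    twint.run.Search(c)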
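The analysis below reads a single `merged.csv`, but the merge step itself is not shown. One way it could be done is sketched here, assuming each per-emoji result has already been loaded into a DataFrame that contains at least the columns used later. The helper name `merge_emoji_frames` and the `KEEP` list are hypothetical, not taken from the repo.

```python
import pandas as pd

# Columns the analysis uses; assumed to be present in every per-emoji frame.
KEEP = ["id", "date", "username", "tweet", "likes_count", "place"]

def merge_emoji_frames(frames):
    """Stack per-emoji DataFrames into one, tagging each row with its emoji.

    `frames` maps an emoji (the search term) to the DataFrame of its results.
    """
    parts = []
    for emoji, frame in frames.items():
        part = frame[KEEP].copy()
        # Record which search produced the row, right after the id column.
        part.insert(1, "emoji", emoji)
        parts.append(part)
    return pd.concat(parts, ignore_index=True)
```

The result could then be written out with `to_csv("merged.csv", index=False)`. A tweet that contains several of the searched emojis would appear once per emoji under this approach.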
Perform analysis
import pandas as pd
df = pd.read_csv("../datasets/merged.csv", lineterminator='\n', parse_dates=['date'])
list(df)
['id', 'emoji', 'date', 'username', 'tweet', 'likes_count', 'place']
df.username.value_counts().head(20)
ms_madhur 1433
sristee44 1121
beingsamikshya 871
anuskashresthax 618
nepalplanettrek 594
scousergirl 572
dreamingdr 569
nepaligentleman 476
itsme_shivangi 433
iamnabinraj75 410
sim_shrestha 402
milan_pu1 353
raunakbasnet1 332
thulokanxo 325
chetan_karki 302
saampokhrel 301
mongolianheartk 300
tenzintsetenbhu 290
rana1997rohit 275
fatyangri 256
Name: username, dtype: int64
df.emoji.value_counts().head(20)
๐ 7576
๐ 4016
๐ 3781
๐ 3482
🇳🇵 2785
๐ 2270
❤️ 2233
๐ 2122
๐ 2000
๐ 1463
๐ 1432
๐ 1241
๐ 1173
๐ 1095
๐ 1001
๐ 730
๐ญ 637
๐ 554
โ๏ธ 505
๐ 483
Name: emoji, dtype: int64
df.date.min()
Timestamp('2012-01-28 00:00:00')
df.set_index('date').resample('M')['id'].count()
date
2012-01-31 2
2012-02-29 16
2012-03-31 13
2012-04-30 3
2012-05-31 11
...
2021-01-31 212
2021-02-28 134
2021-03-31 183
2021-04-30 160
2021-05-31 119
Freq: M, Name: id, Length: 113, dtype: int64
df.set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>
df[df.emoji=="❤️"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>
df[df.emoji=="๐ท"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>
df[df.emoji=="🇳🇵"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>
df[df.emoji=="๐"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>
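The plots above show raw monthly counts, which also rise and fall with the overall number of tweets collected. To check whether an emoji is really used "less" or "more", it may help to look at its share of each month's tweets instead. Below is a minimal sketch, assuming `df` has the `id`, `emoji`, and parsed `date` columns shown earlier; the helper name `monthly_share` is hypothetical.

```python
import pandas as pd

def monthly_share(df, emoji):
    """Fraction of each month's tweets that belong to `emoji`."""
    # Bucket every tweet into its calendar month.
    tmp = df.assign(month=df["date"].dt.to_period("M"))
    total = tmp.groupby("month")["id"].count()
    hits = tmp[tmp["emoji"] == emoji].groupby("month")["id"].count()
    # Months where the emoji never appears get a share of 0.
    return (hits.reindex(total.index, fill_value=0) / total).rename("share")
```

For example, `monthly_share(df, "❤️").plot()` would draw the share of ❤️ tweets per month rather than the raw count. Months with no tweets at all are simply absent from the result.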
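import json

# Keep only the tweets that carry a place.
location_tweet = df.dropna(subset=['place'])

def getlatlon(row):
    # `place` is stored as a Python-style dict string; swap the quotes so it parses as JSON.
    place = json.loads(row.place.replace("'", '"'))
    # In this dataset the coordinates are ordered [lat, lon].
    lat = place['coordinates'][0]
    lon = place['coordinates'][1]
    return pd.Series([lat, lon], index=['lat', 'lon'])

# Add lat/lon columns; rows without a place keep NaN.
df = df.join(location_tweet.apply(getlatlon, axis=1, result_type="expand"))
df.to_csv('location.csv', index=False)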
df
index | id | emoji | date | username | tweet | likes_count | place | lat | lon |
---|---|---|---|---|---|---|---|---|---|
0 | 1398437418768818176 | ๐ | 2021-05-29 | dendikapan | เคถเฅเคญ เคฌเคฟเคนเคพเคจเฅ โ๏ธโ๏ธ เคถเฅเคญเคฆเคฟเคจเคเคพเฅ เคเคพเคฎเคจเคพ๐๐ท๐๐ณ๐ต๐ Hopefull... | 3 | NaN | NaN | NaN |
1 | 1397792077149216769 | ๐ | 2021-05-27 | dendikapan | @damanbro66 Yes I like it ๐so much your post ... | 0 | NaN | NaN | NaN |
2 | 1397341721566990336 | ๐ | 2021-05-26 | dendikapan | @KiranCh77 Really nice ๐thanks for sharing ๐ ... | 0 | NaN | NaN | NaN |
3 | 1394863068035813379 | ๐ | 2021-05-19 | iammuhnaj | It's still Bull season๐ @ ๐๐๐๐ https://t.co/O... | 1 | {'type': 'Point', 'coordinates': [27.713776, 8... | 27.713776 | 85.310244 |
4 | 1393821979032133635 | ๐ | 2021-05-16 | chetan_karki | #just #fun with #baby #myrahsofkarki ๐โค๏ธ looki... | 0 | {'type': 'Point', 'coordinates': [27.67733347,... | 27.677333 | 85.307636 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
45362 | 389751359874289664 | โค๏ธ | 2013-10-14 | ashmoo_moo | #littleone #cuteness #dashain #tika #sister #c... | 0 | {'type': 'Point', 'coordinates': [27.73048071,... | 27.730481 | 85.330964 |
45363 | 389704517815902208 | โค๏ธ | 2013-10-14 | ashmoo_moo | Baba. Prajjwal Dai. โค๏ธ #dashain #tika #family ... | 0 | {'type': 'Point', 'coordinates': [27.73855019,... | 27.738550 | 85.338760 |
45364 | 387917234490048512 | โค๏ธ | 2013-10-09 | ashmoo_moo | #brother #cousins #mamaghar #nagpokhari #morni... | 0 | {'type': 'Point', 'coordinates': [27.71356987,... | 27.713570 | 85.324463 |
45365 | 387542385145954304 | โค๏ธ | 2013-10-08 | ashmoo_moo | #cuteness #babysister #cousins #smile โค๏ธ๐๐ @ M... | 0 | {'type': 'Point', 'coordinates': [27.7090305, ... | 27.709031 | 85.326469 |
45366 | 386357639460179968 | โค๏ธ | 2013-10-05 | ashmoo_moo | on our way back homeeeeee.... โค๏ธ๐ #lastnight #... | 0 | {'type': 'Point', 'coordinates': [27.73868383,... | 27.738684 | 85.338705 |
45367 rows × 9 columns
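The `getlatlon` helper relies on swapping single quotes for double quotes before calling `json.loads`, which would fail if a place string ever contained an apostrophe. A possible alternative is sketched below, assuming the `place` values are Python dict literals like the ones in the table; the function name `parse_place` is hypothetical.

```python
import ast

def parse_place(place):
    """Return (lat, lon) from a place string, or (None, None) if it cannot be parsed."""
    # Missing places are read by pandas as NaN (a float), not as a string.
    if not isinstance(place, str):
        return (None, None)
    try:
        # literal_eval reads the Python-style dict directly, with no quote swapping.
        parsed = ast.literal_eval(place)
        # Coordinates are assumed to be ordered [lat, lon], as in this dataset.
        lat, lon = parsed["coordinates"][:2]
        return (float(lat), float(lon))
    except (ValueError, SyntaxError, KeyError, TypeError):
        return (None, None)
```

It could be applied to the whole column with `df['place'].map(parse_place)`. Note that the `[lat, lon]` order seen here is the reverse of the GeoJSON convention, so it is worth checking before reusing this on other data.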
df.emoji.value_counts()
๐ 7576
๐ 4016
๐ 3781
๐ 3482
🇳🇵 2785
๐ 2270
❤️ 2233
๐ 2122
๐ 2000
๐ 1463
๐ 1432
๐ 1241
๐ 1173
๐ 1095
๐ 1001
๐ 730
๐ญ 637
๐ 554
โ๏ธ 505
๐ 483
๐ 461
๐ 432
๐ 423
โบ๏ธ 375
๐ 373
๐ 332
๐ฑ 314
๐ 296
๐ 292
๐ 291
โ 277
๐ 213
๐ณ 185
๐ฉ 181
๐ด 163
๐ท 135
๐ถ 45
Name: emoji, dtype: int64