The Emojis of Kathmandu

About 50K tweets from 2012 to 2021 (containing emojis and geocoded to Kathmandu) show that we really love 😂.

Recently though, we have been 😂 less and 🙏 more, ๐Ÿ˜ less and ❤️ more. It just goes to show that Kathmandu is healing from Covid.

GitHub repo: https://github.com/ayushsubedi/emojan

Create the dataset using twint

import twint
import pandas as pd
import nest_asyncio

# twint drives its own asyncio event loop; patch asyncio so it can run inside Jupyter
nest_asyncio.apply()
emoji_list = ['๐Ÿ˜',
 '😒',
 '😊',
 '😈',
 '🇳🇵',
 '😌',
 '☕',
 '👶',
 '๐Ÿ‘',
 '😷',
 '👌',
 '🌞',
 '😑',
 '😉',
 '๐Ÿ˜',
 '☺️',
 '😴',
 '😱',
 '🙏',
 '😘',
 '🙌',
 '😔',
 '😋',
 '😂',
 '😩',
 '💕',
 '😎',
 '😭',
 '😳',
 '😇',
 '๐Ÿ˜',
 '😜',
 '☕️',
 '✌️',
 '🙈',
 '❤️',
 '😄',
 '💞']
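# Run one search per emoji and store each result set separately
for emoji in emoji_list:
    c = twint.Config()
    c.Search = emoji          # search term: the emoji itself
    c.Pandas = True
    c.Store_csv = True        # write the results as CSV
    c.Output = emoji          # output is named after the emoji
    c.Hide_output = True      # do not print every tweet to stdout
    c.Near = "kathmandu"      # only tweets geocoded near Kathmandu
    c.Since = "2010-01-01"
    c.Until = "2021-05-31"
    twint.run.Search(c)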
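
The analysis reads a single merged.csv with an emoji column, but the merge step is not shown here. The following is a minimal sketch of one way to build such a file, assuming each emoji's results have already been loaded into a DataFrame. The helper name merge_emoji_frames, the KEEP list, and the toy rows are hypothetical and are not taken from the repo.

```python
import pandas as pd

# Columns kept for the analysis; "emoji" is added during the merge.
KEEP = ["id", "date", "username", "tweet", "likes_count", "place"]


def merge_emoji_frames(frames):
    """Stack per-emoji tweet tables into one, tagging each row with its emoji.

    `frames` maps an emoji to the DataFrame scraped for that emoji.
    """
    tagged = []
    for emoji, frame in frames.items():
        part = frame[KEEP].copy()
        part.insert(1, "emoji", emoji)  # record which search produced the row
        tagged.append(part)
    return pd.concat(tagged, ignore_index=True)


# Made-up rows standing in for two scraped result sets.
demo = {
    "😂": pd.DataFrame({
        "id": [1, 2],
        "date": ["2021-05-01", "2021-05-02"],
        "username": ["a", "b"],
        "tweet": ["x 😂", "y 😂"],
        "likes_count": [0, 3],
        "place": [None, None],
    }),
    "❤️": pd.DataFrame({
        "id": [3],
        "date": ["2021-05-03"],
        "username": ["c"],
        "tweet": ["z ❤️"],
        "likes_count": [1],
        "place": [None],
    }),
}

merged = merge_emoji_frames(demo)
print(merged)
```

In practice each frame would probably come from pd.read_csv on the file written for that emoji, and the result would be saved with to_csv. Under this scheme a tweet that contains several of the searched emojis appears once per emoji.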

Perform analysis

import pandas as pd
df = pd.read_csv("../datasets/merged.csv", lineterminator='\n', parse_dates=['date'])
list(df)
['id', 'emoji', 'date', 'username', 'tweet', 'likes_count', 'place']
df.username.value_counts().head(20)
ms_madhur          1433
sristee44          1121
beingsamikshya      871
anuskashresthax     618
nepalplanettrek     594
scousergirl         572
dreamingdr          569
nepaligentleman     476
itsme_shivangi      433
iamnabinraj75       410
sim_shrestha        402
milan_pu1           353
raunakbasnet1       332
thulokanxo          325
chetan_karki        302
saampokhrel         301
mongolianheartk     300
tenzintsetenbhu     290
rana1997rohit       275
fatyangri           256
Name: username, dtype: int64
df.emoji.value_counts().head(20)
😂     7576
😊     4016
๐Ÿ˜     3781
๐Ÿ˜     3482
🇳🇵    2785
😎     2270
❤️    2233
🙏     2122
😉     2000
😜     1463
😘     1432
😋     1241
💕     1173
😄     1095
๐Ÿ‘     1001
👌      730
😭      637
๐Ÿ˜      554
✌️     505
😒      483
Name: emoji, dtype: int64
df.date.min()
Timestamp('2012-01-28 00:00:00')
df.set_index('date').resample('M')['id'].count()
date
2012-01-31      2
2012-02-29     16
2012-03-31     13
2012-04-30      3
2012-05-31     11
             ... 
2021-01-31    212
2021-02-28    134
2021-03-31    183
2021-04-30    160
2021-05-31    119
Freq: M, Name: id, Length: 113, dtype: int64
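
Resampling groups the rows into calendar months and also emits months with no tweets as 0, so a line plot of the result does not silently skip gaps. Here is a small sketch on made-up dates. It uses the 'MS' (month start) alias, an assumption made here because recent pandas releases deprecate 'M' in favour of 'ME'. The only visible difference is that each month is labelled by its first day instead of its last.

```python
import pandas as pd

# Four made-up tweets: two in January, none in February, two in March.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "date": pd.to_datetime(
        ["2021-01-03", "2021-01-20", "2021-03-05", "2021-03-06"]
    ),
})

# Count tweets per calendar month; empty months show up with a count of 0.
monthly = toy.set_index("date")["id"].resample("MS").count()
print(monthly)
```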
df.set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>

[Figure: tweets per month, all emojis]

df[df.emoji=="❤️"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>

[Figure: tweets per month for ❤️]

df[df.emoji=="😷"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>

[Figure: tweets per month for 😷]

df[df.emoji=="🇳🇵"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>

[Figure: tweets per month for 🇳🇵]

df[df.emoji=="😂"].set_index('date').resample('M')['id'].count().plot()
<AxesSubplot:xlabel='date'>

[Figure: tweets per month for 😂]
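
Raw monthly counts mix two things: how much people tweet overall and which emoji they pick. One way to separate them is to look at each emoji's share of the tweets in a given year. This is a sketch under that assumption, not the method used for the plots above; the function name emoji_share_by_year and the toy data are hypothetical.

```python
import pandas as pd


def emoji_share_by_year(df):
    """Return a year x emoji table with each emoji's share of that year's tweets."""
    counts = pd.crosstab(df["date"].dt.year, df["emoji"])
    # Divide every row by its total so that each year sums to 1.
    return counts.div(counts.sum(axis=1), axis=0)


# Made-up data: 😂 dominates 2019, then splits evenly with ❤️ in 2021.
toy = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-02-01", "2019-06-01", "2019-07-01", "2019-09-01",
        "2021-01-01", "2021-02-01",
    ]),
    "emoji": ["😂", "😂", "😂", "❤️", "😂", "❤️"],
})

share = emoji_share_by_year(toy)
print(share)
```

On the real data this would be called as emoji_share_by_year(df), since df already has a parsed date column and an emoji column. A falling share for one emoji next to a rising share for another is a more direct check of the "less of this, more of that" reading than comparing two count plots by eye.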

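# Keep only the tweets that carry a geotag
location_tweet = df.dropna(subset=['place'])
import json

def getlatlon(row):
    # `place` is a stringified dict such as "{'type': 'Point', 'coordinates': [...]}"
    place = row.place
    place = place.replace("'", '"')  # single to double quotes so json can parse it
    place = json.loads(place)
    lat = place['coordinates'][0]
    lon = place['coordinates'][1]
    return pd.Series([lat, lon], index=['lat', 'lon'])

# The join is on the index, so tweets without a geotag get NaN for lat and lon
df = df.join(location_tweet.apply(getlatlon, axis=1, result_type="expand"))
df.to_csv('location.csv', index=False)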
df

      | id                  | emoji | date       | username     | tweet | likes_count | place | lat | lon
0     | 1398437418768818176 | 👌 | 2021-05-29 | dendikapan   | शुभ बिहानी ☕️☕️ शुभदिनकाे कामना🙏😷👌🇳🇵💞 Hopefull... | 3 | NaN | NaN | NaN
1     | 1397792077149216769 | 👌 | 2021-05-27 | dendikapan   | @damanbro66 Yes I like it 👌so much your post ... | 0 | NaN | NaN | NaN
2     | 1397341721566990336 | 👌 | 2021-05-26 | dendikapan   | @KiranCh77 Really nice 👌thanks for sharing 👌 ... | 0 | NaN | NaN | NaN
3     | 1394863068035813379 | 👌 | 2021-05-19 | iammuhnaj    | It's still Bull season👌 @ 𝙃𝙊𝙈𝙀 https://t.co/O... | 1 | {'type': 'Point', 'coordinates': [27.713776, 8... | 27.713776 | 85.310244
4     | 1393821979032133635 | 👌 | 2021-05-16 | chetan_karki | #just #fun with #baby #myrahsofkarki 😊❤️ looki... | 0 | {'type': 'Point', 'coordinates': [27.67733347,... | 27.677333 | 85.307636
...   | ...                 | ... | ...        | ...          | ... | ... | ... | ... | ...
45362 | 389751359874289664  | ❤️ | 2013-10-14 | ashmoo_moo   | #littleone #cuteness #dashain #tika #sister #c... | 0 | {'type': 'Point', 'coordinates': [27.73048071,... | 27.730481 | 85.330964
45363 | 389704517815902208  | ❤️ | 2013-10-14 | ashmoo_moo   | Baba. Prajjwal Dai. ❤️ #dashain #tika #family ... | 0 | {'type': 'Point', 'coordinates': [27.73855019,... | 27.738550 | 85.338760
45364 | 387917234490048512  | ❤️ | 2013-10-09 | ashmoo_moo   | #brother #cousins #mamaghar #nagpokhari #morni... | 0 | {'type': 'Point', 'coordinates': [27.71356987,... | 27.713570 | 85.324463
45365 | 387542385145954304  | ❤️ | 2013-10-08 | ashmoo_moo   | #cuteness #babysister #cousins #smile ❤️😘💋 @ M... | 0 | {'type': 'Point', 'coordinates': [27.7090305, ... | 27.709031 | 85.326469
45366 | 386357639460179968  | ❤️ | 2013-10-05 | ashmoo_moo   | on our way back homeeeeee.... ❤️😘 #lastnight #... | 0 | {'type': 'Point', 'coordinates': [27.73868383,... | 27.738684 | 85.338705

45367 rows × 9 columns
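
The place values shown above are Python dict literals stored as text, which is why getlatlon swaps the quotes before calling json.loads. That swap would break on a value that itself contains an apostrophe. A hedged alternative is ast.literal_eval from the standard library, which parses the literal directly. The function name parse_place and the sample row are hypothetical.

```python
import ast

import pandas as pd


def parse_place(place):
    """Turn a stringified place dict into a (lat, lon) Series; NaN if missing."""
    if pd.isna(place):
        return pd.Series([float("nan"), float("nan")], index=["lat", "lon"])
    # literal_eval only evaluates Python literals, so it is safe on untrusted text.
    coords = ast.literal_eval(place)["coordinates"]
    return pd.Series(coords[:2], index=["lat", "lon"])


# One tweet without a geotag and one with the format seen in the table.
toy = pd.DataFrame({
    "place": [
        None,
        "{'type': 'Point', 'coordinates': [27.713776, 85.310244]}",
    ],
})

latlon = toy["place"].apply(parse_place)
print(latlon)
```

Because missing values are handled inside the function, this version can be applied to the whole place column without a separate dropna step.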

df.emoji.value_counts()
😂     7576
😊     4016
๐Ÿ˜     3781
๐Ÿ˜     3482
🇳🇵    2785
😎     2270
❤️    2233
🙏     2122
😉     2000
😜     1463
😘     1432
😋     1241
💕     1173
😄     1095
๐Ÿ‘     1001
👌      730
😭      637
๐Ÿ˜      554
✌️     505
😒      483
💞      461
😇      432
🙌      423
☺️     375
😔      373
😌      332
😱      314
😑      296
😈      292
🙈      291
☕      277
🌞      213
😳      185
😩      181
😴      163
😷      135
👶       45
Name: emoji, dtype: int64