games and twitter
1. Introduction
Motivation:
Along with the explosion in popularity of mobile devices, mobile games have also become more and more popular. From older titles such as Plants vs. Zombies and Flappy Bird to more recent multiplayer games such as Clash of Clans and Pokemon Go, a variety of games have achieved great success with very different strategies. Investigating how popular they are, and what makes them popular or not, would yield information that is invaluable for determining better short- and long-term plans for a product and its service.
Text mining of social media such as Twitter comes in handy for this kind of task. Because Twitter data constitutes a rich source of information about any topic imaginable, it can be used to find trends related to a specific keyword, measure brand sentiment, and gather feedback about new products and services.
In this proposal, I will demonstrate that Twitter can be used to (1) carry out an efficient and productive survey of the mobile game business, and (2) perform a detailed analysis of a given product (here Pokemon Go, as an example) to find important information such as hot terms and topics, product usage patterns, and customer sentiment toward the product. The analysis can be found in this notebook.
Methods/Tools:
I use the Twitter Streaming API together with Python libraries such as $\texttt{Tweepy}$ and $\texttt{twitterscraper}$ to collect user information and relevant tweets (from the past 1-2 weeks) about a set of popular mobile games. I use Natural Language Processing libraries such as $\texttt{NLTK}$ to find keywords that characterize each game, and $\texttt{matplotlib}$ and $\texttt{Vincent}$ to render plots.
2. Data collection
For the popularity survey, I first use the Twitter Streaming API together with Python libraries to extract user information, such as the number of followers, the account creation time, and the number of likes, from the official Twitter accounts of 14 of the most popular mobile games, chosen based on internet reviews in 2015 and 2016.
I then collect all tweets related to each game in the past week (Oct-23 to Oct-29, 2016) using the web scraping library twitterscraper. This results in about 120K tweets in total.
To perform a more detailed study of a given mobile game, I pick the most popular game, Pokemon Go, as an example and expand the data set to include all tweets within the past two weeks. That gives a total tweet count of about 120K for Pokemon Go alone.
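As a minimal sketch of the user-information step, the function below reduces one account record to the fields used in the survey. It assumes the record is a dictionary shaped like a Twitter API v1.1 user object, and that "number of likes" corresponds to the `favourites_count` field; the function name, the `as_of` parameter, and the example record are hypothetical and not taken from the notebook.

```python
from datetime import datetime, timezone

# Timestamp layout of the "created_at" field in Twitter API v1.1 objects.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarize_account(user, as_of):
    """Reduce a user record (a dict) to the fields used in the survey."""
    created = datetime.strptime(user["created_at"], TWITTER_TIME_FORMAT)
    return {
        "screen_name": user["screen_name"],
        "followers": user["followers_count"],
        "likes": user["favourites_count"],  # assumption: "likes" means this field
        "created": created,
        "age_days": (as_of - created).days,
    }


# Made-up record for illustration only, not real account data.
example_user = {
    "screen_name": "example_game",
    "followers_count": 1500,
    "favourites_count": 42,
    "created_at": "Mon Oct 24 12:00:00 +0000 2016",
}
summary = summarize_account(
    example_user, as_of=datetime(2016, 10, 29, 12, 0, tzinfo=timezone.utc)
)
```

Keeping the account age next to the follower count makes it easy to check later whether a large following simply reflects a long-lived account.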
3. Popularity Survey
3-1. user information from official account
One easy way to measure whether a game is popular is to pull out some information associated with its official Twitter account, such as the number of followers and the number of likes. Normally, a more popular game would have more followers and more likes. However, caution is needed, as not every Twitter user who plays a particular game follows its official account.
The following plot shows the total number of followers to date for a list of 14 popular mobile games (based on internet reviews in 2015 and 2016). There are two clear winners, Clash of Clans and Pokemon Go: both have more than 2 million followers, which clearly shows that both are or were very popular. Note that follower counts generally only go up with time, so there is a bias toward games that have been around longer.
To expose this bias, we plot the number of followers (left) together with the number of days in each game’s lifetime (right). Note that Pokemon Go was only released this June, but its official account was registered in 2014, when the game was conceived. Both Clash of Clans and Pokemon Go have lifetimes close to or below the average for the games considered, yet their follower counts are exceptional, which clearly indicates their popularity in recent years. Pokemon Go in particular shows strong momentum.
On the other hand, the number of likes might not be a good indicator of a game’s recent popularity. Some games, such as Angry Birds and Plants vs. Zombies, have gained more likes because of their longevity, while a popular game such as Pokemon Go has fewer likes partly because it is very young. One exception is Ingress, a location-based multiplayer online game and a relative of Pokemon Go: it has the third-highest number of likes despite a lifetime shorter than the average for the games in the list.
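One simple way to reduce the lifetime bias discussed above is to divide each count by the account age. The sketch below is an illustration of that normalization only; the function name and the numbers are made up and are not results from this study.

```python
def followers_per_day(followers, age_days):
    """Average number of followers gained per day of account lifetime."""
    if age_days <= 0:
        raise ValueError("age_days must be positive")
    return followers / age_days


# Hypothetical (followers, age in days) pairs, for illustration only.
accounts = {
    "game_a": (2_000_000, 1000),
    "game_b": (500_000, 2000),
}
rates = {name: followers_per_day(f, d) for name, (f, d) in accounts.items()}
ranked = sorted(rates, key=rates.get, reverse=True)
```

An average rate still hides when the followers were gained, so it complements rather than replaces the side-by-side plot.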
3-2. tweets count
Another way to measure the popularity of a game is to count how many times Twitter users mention it. I therefore scraped the tweets from the past week for the following six games: Candy Crush, Clash of Clans, Flappy Bird, Plants vs Zombies, Subway Surfers, Pokemon Go.
The following plot shows the number of tweets that mentioned each game in the past week.
- It is not surprising to see that Pokemon Go alone reaches 70K tweets, while the other five games combined account for only 10K.
This indicates that either
- 1) overall, Pokemon Go is the most dominant mobile game on social media, or
- 2) Pokemon Go players tweet much more often (almost 100 times more) than players of other games,
although the second is less likely. Further study with data over a longer time span and/or with user information would help to tell the two apart. In any case, Pokemon Go does a great job of getting its players excited, as indicated by the huge number of tweets over a relatively short period of time.
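The per-game tweet count can be sketched as a keyword match over tweet texts. The keyword variants and the sample tweets below are assumptions made for illustration; the search terms actually used for scraping are not stated here.

```python
from collections import Counter

# Hypothetical search terms per game (lowercase); not the notebook's actual queries.
GAME_KEYWORDS = {
    "Candy Crush": ["candy crush", "candycrush"],
    "Clash of Clans": ["clash of clans", "clashofclans"],
    "Flappy Bird": ["flappy bird", "flappybird"],
    "Plants vs Zombies": ["plants vs zombies", "plantsvszombies"],
    "Subway Surfers": ["subway surfers", "subwaysurfers"],
    "Pokemon Go": ["pokemon go", "pokemongo"],
}


def count_mentions(tweets, keywords=GAME_KEYWORDS):
    """Count the tweets that mention each game at least once."""
    counts = Counter({game: 0 for game in keywords})
    for text in tweets:
        lowered = text.lower()
        for game, terms in keywords.items():
            if any(term in lowered for term in terms):
                counts[game] += 1
    return counts


# Made-up tweets for illustration.
sample_tweets = [
    "Caught a rare one in Pokemon Go today",
    "#PokemonGo halloween event!",
    "Still stuck on Candy Crush level 50",
    "Nothing about games here",
]
mention_counts = count_mentions(sample_tweets)
```

Each tweet is counted once per game even if it repeats the name, so the totals are comparable to a count of matching tweets rather than a count of word occurrences.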
3. Pattern of usage
3-1. weekly pattern
Before we move on to the text content, let’s plot a histogram of the total tweet count grouped by day of the week. For this and the following studies, I have doubled the data size, which now spans the past two weeks (120K tweets in total). Based on the following plot,
- there is a clear upward trend from Monday to Thursday, a sudden drop on Friday, and then a gradual climb back over the weekend. Whether this indicates that people spend more time on Pokemon Go from Monday to Thursday, or simply that people tend to tweet more from Monday to Thursday, is unclear at this point.
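The weekday histogram can be sketched as a simple group-by over tweet timestamps. The function name and the sample timestamps are illustrative assumptions, not the notebook code.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def tweets_per_weekday(timestamps):
    """Return (weekday, count) pairs in Monday-to-Sunday order."""
    counts = Counter(ts.weekday() for ts in timestamps)  # Monday is 0
    return [(WEEKDAYS[i], counts.get(i, 0)) for i in range(7)]


# Made-up timestamps for illustration.
sample_times = [
    datetime(2016, 10, 24, 9, 0),   # Monday
    datetime(2016, 10, 24, 13, 0),  # Monday
    datetime(2016, 10, 27, 12, 0),  # Thursday
    datetime(2016, 10, 29, 18, 0),  # Saturday
]
weekday_counts = tweets_per_weekday(sample_times)
```

With only two weeks of data, each weekday bin contains just two calendar days, so a single event (such as a game update) can dominate a bin.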
3-2. daily pattern
- Most of the tweets occur during the daytime, from 6am to 4pm, which is reasonable.
- The count peaks around noon, perhaps because people have more free time during lunch.
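An hour-of-day histogram depends on which time zone the timestamps are interpreted in. The sketch below assumes timezone-aware UTC timestamps and a single fixed offset for all users; both are simplifying assumptions, and the offset value is arbitrary.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def tweets_per_hour(timestamps_utc, utc_offset_hours=0):
    """Return a 24-element list of tweet counts by local hour of day."""
    local = timezone(timedelta(hours=utc_offset_hours))
    counts = Counter(ts.astimezone(local).hour for ts in timestamps_utc)
    return [counts.get(hour, 0) for hour in range(24)]


# Made-up UTC timestamps for illustration.
sample_utc = [
    datetime(2016, 10, 24, 16, 30, tzinfo=timezone.utc),
    datetime(2016, 10, 24, 17, 5, tzinfo=timezone.utc),
    datetime(2016, 10, 25, 16, 10, tzinfo=timezone.utc),
]
hourly = tweets_per_hour(sample_utc, utc_offset_hours=-4)
peak_hour = max(range(24), key=lambda h: hourly[h])
```

Because tweets come from users around the world, a single offset is only a rough approximation; a per-user time zone would be needed for a sharper daily pattern.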
4. Text analysis
4-1. terms occurrence
By extracting and ranking all meaningful single words in the text of tweets from the last two weeks, we obtain the 10 most frequent words:
- (‘twitter’, 38406), (‘pic’, 23536), (‘halloween’, 10795), (‘video’, 9545), (‘new’, 8537), (‘update’, 7604), (‘liked’, 7068), (‘event’, 6850), (‘play’, 5300), (‘get’, 4227)
- The words ‘pic’ and ‘video’ clearly show the popular media content of the tweets;
- ‘halloween’, ‘new’, and ‘update’ are related to the recent update of the app;
- ‘halloween’ and ‘event’ refer to the newly announced Halloween event, which aims to encourage players to keep playing even in the cooler weather.
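The term ranking above can be sketched with a tokenizer and a counter. This is a pure-Python illustration: the tiny stop word list stands in for a full list such as the one shipped with NLTK, and the choice to skip hashtags and mentions (which are ranked separately) is an assumption about the preprocessing.

```python
import re
from collections import Counter

# Tiny illustrative stop word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "and", "of", "in", "for", "it"}


def tokenize(text):
    """Lowercase a tweet and return its plain words, without stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())  # drop URLs
    # Match words that are not part of a #hashtag or an @mention.
    words = re.findall(r"(?<![#@\w])[a-z]+", text)
    return [w for w in words if len(w) > 1 and w not in STOPWORDS]


# Made-up tweets for illustration.
sample_tweets = [
    "New Halloween event! #pokemongo",
    "Halloween update is live https://t.co/xyz",
    "@friend get the new update",
]
term_counts = Counter(tok for tweet in sample_tweets for tok in tokenize(tweet))
top_terms = term_counts.most_common(3)
```

Tokens such as ‘pic’ and ‘twitter’ would be expected to survive this kind of tokenization whenever tweets embed `pic.twitter.com` links, which may explain part of their high counts.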
Other rankings are as follows.
The 20 most frequent hashtags:
- (‘#pokemongo’, 45988), (‘#pokemon’, 6280), (‘#funny’, 3757), (‘#minecraft’, 3723), (‘#agario’, 3703), (‘#amazing’, 3695), (‘#game’, 3501), (‘#new’, 3342), (‘#europe’, 3319), (‘#turkey’, 3319), (‘#trolling’, 3318), (‘#love’, 2584), (‘#pok’, 2055), (‘#gaming’, 1163), (‘#pokeballs’, 1053), (‘#pokemongocoinspic’, 1015), (‘#teamvalor’, 779), (‘#pokecoins’, 605), (‘#tech’, 601), (‘#halloween’, 555)
The 10 most frequent mentions:
- (‘@youtube’, 9318), (‘@leafyishere’, 1304), (‘@omgitsalia’, 1239), (‘@pokemongoapp’, 1210), (‘@trnrtips’, 1029), (‘@nianticlabs’, 907), (‘@fsu_atl’, 522), (‘@lachlanyt’, 301), (‘@suknives’, 286), (‘@witelightinghwd’, 222)
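Hashtag and mention rankings like the two above can be sketched with regular expressions. The patterns below are a simplification (for example, they would also match the tail of an email address), and the sample tweets are made up.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")


def count_entities(tweets, pattern):
    """Count case-insensitive matches of `pattern` across all tweets."""
    return Counter(
        match.lower() for text in tweets for match in pattern.findall(text)
    )


# Made-up tweets for illustration.
sample_tweets = [
    "#PokemonGO is fun @YouTube",
    "Watch on @youtube #pokemongo #halloween",
]
hashtag_counts = count_entities(sample_tweets, HASHTAG)
mention_counts = count_entities(sample_tweets, MENTION)
```

Lowercasing before counting merges variants such as `#PokemonGO` and `#pokemongo`, which matches the all-lowercase entries in the rankings.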
A bigram search gives the 20 most frequent pairs of adjacent words:
- (‘pic twitter’, 23446), (‘@youtube video’, 7179), (‘#pokemongo pic’, 4509), (‘halloween event’, 3673), (‘#game #trolling’, 3318), (‘play pokemon’, 2750), (‘go update’, 2228), (‘#love #amazing’, 2199), (‘new pokemon’, 1497), (‘halloween update’, 1182), (‘celebrates halloween’, 968), (‘need pokecoins’, 911), (‘rare pokemon’, 814), (‘apple pen’, 768), (‘pine apple’, 764), (‘ppap pine’, 764), (‘still play’, 747), (‘still playing’, 744), (‘candy count’, 695), (‘go cheats’, 685)
which indicates that
- Pokemon Go players and sellers use Twitter and YouTube frequently;
- recent updates and events show up promptly in tweets;
- tweets show a great need for pokecoins, candy, rare species, etc., which might pose difficulties for current and potential users;
- Pokemon Go is very addictive, as people like to use phrases such as ‘still play/playing’;
- Pokemon Go’s influence even reaches popular songs (e.g., Pen-Pineapple-Apple-Pen).
To sum up the findings, Pokemon Go clearly has successful marketing strategies for engaging its players, such as holding events and sharing pictures and videos with and among users. It is also well integrated into pop culture, from which the game can benefit. Certain needs, such as pokecoins and rare species, might pose difficulties for current and potential players, and their effects should be investigated further.
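The bigram ranking can be sketched by pairing each token with its successor. The sketch assumes each tweet has already been tokenized and filtered, and it forms pairs within a tweet only, so no pair spans two tweets; the token lists are made up.

```python
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs from one tweet, joined by a space."""
    return [f"{first} {second}" for first, second in zip(tokens, tokens[1:])]


# Made-up, already tokenized tweets for illustration.
tokenized_tweets = [
    ["halloween", "event", "starts"],
    ["new", "halloween", "event"],
    ["update"],
]
bigram_counts = Counter(
    pair for tokens in tokenized_tweets for pair in bigrams(tokens)
)
```

Whether stop words are removed before pairing changes the result: removing them first turns non-adjacent words in the original text into neighbors.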
4-2. sentiment analysis
We can also measure the sentiment scores of the tweets about Pokemon Go, which gives an indication of players’ opinions of the game. We use the same data sample as above, but as a test take only the most recent 4000 tweets and run them through a sentiment classifier based on Word Sense Disambiguation, using WordNet and word occurrence statistics from NLTK.
- Based on the 4000 tweets, we find a total positive score (85.6) that is much greater than the total negative score (33.9), indicating that Pokemon Go is receiving very positive feedback overall.
A similar analysis can be applied to other games, and a comparison across a pool of mobile games would also show how much people like each game.
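To illustrate how total positive and negative scores are accumulated, the sketch below uses a plain word lexicon. This is a deliberately simpler technique than the Word Sense Disambiguation classifier used in the analysis: it ignores word senses entirely, and the lexicon entries and scores are made up for the example.

```python
# Hypothetical lexicon: word -> (positive score, negative score).
TOY_LEXICON = {
    "love": (0.625, 0.0),
    "great": (0.75, 0.0),
    "fun": (0.5, 0.0),
    "hate": (0.0, 0.75),
    "broken": (0.0, 0.5),
}


def score_tweets(tweets, lexicon=TOY_LEXICON):
    """Sum positive and negative word scores over all tweets."""
    pos_total = 0.0
    neg_total = 0.0
    for text in tweets:
        for raw_word in text.lower().split():
            word = raw_word.strip(".,!?#")  # trim punctuation around the word
            pos, neg = lexicon.get(word, (0.0, 0.0))
            pos_total += pos
            neg_total += neg
    return pos_total, neg_total


# Made-up tweets for illustration.
sample_tweets = [
    "I love this game, great fun!",
    "Servers are broken again",
]
pos_score, neg_score = score_tweets(sample_tweets)
```

Totals computed this way grow with the number of tweets, so comparing games fairly would require normalizing by tweet count or reporting the ratio of positive to negative scores.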
5. Summary
In this proposal, I have demonstrated that we can use data from Twitter to gain insight into the mobile game industry.
- We can perform an efficient and productive survey, which benefits both game development and marketing.
- We can also perform a detailed analysis of a given product (here Pokemon Go, as an example) to find important information such as hot terms and topics, product usage patterns, and customers’ opinions of the product.
Further improvements:
- a bigger data set and a larger sample of mobile games
- distinguishing tweets from different categories, such as original posts or retweets, ads or sales, etc.
- a quantitative measure of the relationship between Twitter users and actual game players