SEARCHING FOR RELEVANT TWEETS BASED ON TOPIC-RELATED USER ACTIVITIES
Keywords: Social media, Twitter, social network analysis, search, graph-based approach
Twitter is one of the largest social media platforms. Although it can be used to get information on a topic of interest, finding tweets relevant to the topic is difficult because of the massive volume of tweets and the short length of each tweet. Some relevant tweets do not include any terms explicitly related to the topic, so general content-based keyword search and query expansion techniques are not effective for finding them. To solve this problem, we present a method for finding tweets on a topic of interest based on Twitter user activities related to the topic, such as tweeting, retweeting, and replying. The method consists of two phases: the preparation phase and the main phase. In the preparation phase, we create a user-tweet reference graph representing the relation between users and tweets based on past user activities related to the topic, calculate the influence of each user and tweet in the topic, and then define two types of power for each user, called “Voice” and “Impact”, indicating “how much voice the user has on the topic” and “how much impact the user has on the other users’ tweets on the topic”. In the main phase, we calculate the relevance of newly arrived tweets to the topic according to the Voice and Impact scores of the users who posted, retweeted, or replied to each tweet, and then rank the tweets by relevance score. The two phases are processed independently: once the preparation phase is completed, the main phase can return the final result at any time. Experimental results show that “who retweeted or replied to the tweet” is more effective for judging the relevance of each tweet to the topic than “who posted the tweet”, and that our method can find relevant tweets that do not include any terms explicitly related to the topic. We compare our method with an indegree-based method and a PageRank-based method, and show that our method outperforms both.
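The two-phase structure described above can be illustrated with a minimal Python sketch. The abstract does not give the formulas for influence, Voice, or Impact, so everything numeric here is a placeholder assumption: tweet influence is approximated by a simple reaction count, Voice by the summed influence of a user's own tweets, and Impact by the summed influence of other users' tweets that the user retweeted or replied to. The function names `prepare` and `rank`, the input shapes, and the weights `w_post` and `w_react` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def prepare(past_activities):
    """Preparation phase (sketch): derive per-user Voice and Impact scores.

    past_activities: iterable of (user, action, tweet_id), where action is
    "tweet", "retweet", or "reply". All scoring below is a placeholder,
    NOT the paper's actual influence calculation.
    """
    author = {}                   # tweet_id -> user who posted it
    reactors = defaultdict(set)   # tweet_id -> users who retweeted/replied
    for user, action, tweet_id in past_activities:
        if action == "tweet":
            author[tweet_id] = user
        else:
            reactors[tweet_id].add(user)

    # Placeholder influence: 1 for the tweet itself plus one per distinct
    # other user who reacted to it (an indegree-style count).
    influence = {t: 1 + len(reactors[t] - {a}) for t, a in author.items()}

    voice = defaultdict(float)    # assumed: influence of the user's own tweets
    impact = defaultdict(float)   # assumed: influence of others' tweets reacted to
    for t, a in author.items():
        voice[a] += influence[t]
        for u in reactors[t]:
            if u != a:
                impact[u] += influence[t]
    return dict(voice), dict(impact)


def rank(new_tweets, voice, impact, w_post=1.0, w_react=1.0):
    """Main phase (sketch): score newly arrived tweets and rank them.

    new_tweets: dict of tweet_id -> (poster, list of reacting users).
    w_post and w_react are hypothetical weights for the poster's Voice and
    the reactors' Impact.
    """
    scores = {}
    for tweet_id, (poster, reacting) in new_tweets.items():
        scores[tweet_id] = (
            w_post * voice.get(poster, 0.0)
            + w_react * sum(impact.get(u, 0.0) for u in set(reacting))
        )
    # Highest score first; ties broken by tweet id for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))


past = [
    ("alice", "tweet", "t1"),
    ("bob", "retweet", "t1"),
    ("carol", "reply", "t1"),
    ("bob", "tweet", "t2"),
    ("carol", "retweet", "t2"),
]
voice, impact = prepare(past)
new = {"n1": ("dave", ["carol"]), "n2": ("alice", []), "n3": ("dave", [])}
print(rank(new, voice, impact))  # ['n1', 'n2', 'n3']
```

In this toy data, `n1` is posted by a user with no topic history but ranks first because a user with a high Impact score reacted to it. Because `prepare` runs only on past activity, its output can be computed once and reused for any number of `rank` calls, which matches the independence of the two phases.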