Sonora: A Prescriptive Model for Message Authoring on Twitter

Wednesday, September 24, 2014 - 16:30
TH 331
Neal Lewis (IBM Almaden Research Center)
Within social networks, certain messages propagate with more ease or attract more attention than others. This effect can be a consequence of several factors, such as the topic of the message, the number of followers, real-time relevance, and the person who is sending the message. Only one of these factors is within a user’s reach at authoring time: how to phrase the message. In this paper we examine how word choice contributes to the propagation of a message. We present a prescriptive model that analyzes words based on their historic performance in retweets in order to propose enhancements in future tweet performance. Our model calculates a novel score (SONORA SCORE) that is built on three aspects of diffusion: volume, prevalence, and sustain. We show that SONORA SCORE has powerful predictive ability, and that it complements social and tweet-level features to achieve an F1 score of 0.82 in retweet prediction. Moreover, it has the ability to prescribe changes to the tweet wording such that when the SONORA SCORE for a tweet is higher, the tweet is twice as likely to have more retweets. Lastly, we show how our prescriptive model can be used to assist users in content creation for optimized success on social media. Because the model works at the word level, it lends itself extremely well to the creation of user interfaces that help authors incrementally, word by word, refine their message until its potential is maximized and it is ready for publication. We present an easy-to-use iOS application that illustrates the potential of incremental refinement using SONORA SCORE coupled with the familiarity of a traditional spell checker.

Neal Lewis is an SFSU alumnus and Research Engineer at the IBM Almaden Research Center. Neal works for the newly formed IBM Watson Group within the Core Technology division, focusing on Contextual Analytics and Linguistic Inference. His research and professional interests include Natural Language Processing, Data Mining, and Systems.