Spam detection by message dissemination patterns on social networks
Oral Defence Date: Thursday, July 11, 2019 - 13:30
Assistant Prof. Hao Yue and Prof. Bill Hsu
Spam has been a major concern for applications ranging from early instant messaging to modern social networks. Users with malicious intent run spam campaigns to achieve financial, personal, or political goals. Because these campaigns operate at large scale and have far-reaching repercussions, it is critical to address them, both to prevent the exploitation of benign users and to keep the networking platform clean and usable. Existing work detects spam messages in various ways, such as analysing the content of messages or analysing individual accounts to judge the trustworthiness of their messages. However, as the ways of sharing messages and information on social networks keep evolving, attackers have repeatedly found ways to evade these detection techniques. In this project, we introduce a more robust mechanism for spam detection. We differentiate spam messages from benign messages based on how the messages are disseminated among the users of a close-knit community. For each tweet, we create a vector that records the number of times the tweet is retweeted by users with certain interests. We build a machine learning classifier based on a stacked auto-encoder and train it on these vectors to identify spam messages. We test the new method on different datasets; it achieves an accuracy of 96.1% in predicting whether a given message is spam or benign, which outperforms most modern spam detection techniques for social networks.
spam detection, message dissemination paths, stacked auto-encoders, topics of interest, strongly connected graphs
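The per-tweet vector described in the abstract can be illustrated with a minimal sketch. It assumes that each user is labelled with a single topic of interest and that the vector has one entry per topic. The topic list, the `dissemination_vector` helper, and the toy data are all hypothetical and are not taken from the thesis, which may define interests and counts differently.

```python
from collections import Counter

# Hypothetical topics of interest; the thesis does not list the actual ones here.
INTERESTS = ["sports", "politics", "technology", "music"]


def dissemination_vector(retweeters, user_interest, interests=INTERESTS):
    """Count how many times a tweet was retweeted by users of each interest.

    retweeters    -- user ids, one entry per retweet (repeats allowed)
    user_interest -- dict mapping user id to a single interest label (assumption)
    """
    # Retweets by users with no known interest are skipped.
    counts = Counter(user_interest[u] for u in retweeters if u in user_interest)
    # Fixed topic order so that every tweet yields a vector of the same length.
    return [counts.get(topic, 0) for topic in interests]


# Toy data, made up for illustration only.
user_interest = {
    "alice": "sports",
    "bob": "sports",
    "carol": "music",
    "dave": "politics",
}
retweeters = ["alice", "bob", "carol", "alice"]

vector = dissemination_vector(retweeters, user_interest)
print(vector)  # [3, 0, 0, 1]
```

Vectors of this kind, one per tweet, would then serve as the input to a classifier such as the stacked auto-encoder mentioned in the abstract.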