Spam in Online Social Networks (OSNs) is a systemic problem that poses a threat to these services by undermining their value to advertisers and potential investors, as well as negatively affecting user engagement. While the well-studied problem of email spam is (almost!) solved, OSN spam is a very different and interesting problem. For instance, traditional email spam usually consists of sending bulk unsolicited messages to numerous recipients; spam on Twitter or Facebook, however, is not bound by this volume constraint, as a single spam message can propagate through social interaction functions and reach a wide audience. Moreover, a recent study showed that the largest campaigns of suspended Twitter accounts directed users via affiliate links to reputable websites that pay a commission on purchases, such as Amazon. Such findings blur the line about what constitutes OSN spam.

Example of the most popular Arabic spam campaign on Twitter in 2013

Typically, the goal of the designers of these social bots is to make them exhibit human-like behavior. In some sense, spammers share the Artificial Intelligence researchers' dream of designing a computer algorithm that passes the Turing test (requiring that a human being be unable to distinguish the machine from another human being).

So how can we detect these accounts that endanger the online ecosystems as well as our society?

In this work, we compare normal and malicious users on Twitter in terms of their behavioral properties. We find that there exist two behaviorally distinct categories of spammers, just like viruses: 1) naive, short lived & aggressive; 2) sophisticated, stealthy that embeds itself first. We then analyze the detectability of these spam accounts with respect to three categories of features, namely, content attributes (linguistic cue ), social interactions (dimensions of information diffusion patterns ), and profile properties (metadata related to the account). Our biggest finding was that that malicious accounts can easily generate chatter that is indistinguishable from benign (human) users, while it is much harder for these malicious bots to mimic the social interactions of human users.