Master Thesis Defense: İnanç Arın

IDENTIFICATION OF ANONYMOUS USERS IN TWITTER
 

İnanç Arın
Computer Science and Engineering, MSc Program, 2012 

 

Thesis Jury

Assoc. Prof. Yücel Saygin (Thesis Supervisor), Assoc. Prof. Berrin Yanikoğlu, Asst. Prof. Hüsnü Yenigün, Assoc. Prof. Mehmet Ercan Nergiz, Assoc. Prof. Tonguç Ünlüyurt

Date & Time: August 2nd, 2012 – 11:00

Place: FENS L063

 

Abstract

Users may maintain multiple profiles when writing comments, blogs, and tweets on the web. While some of these profiles reveal the user's true identity, others are created under pseudonyms. Pseudonymity is especially important in countries with oppressive governments, where activists write pseudonymous tweets or Facebook messages. In these countries, the discovery by government officials that a person is among the activists may have serious consequences: the activist may be imprisoned, or his or her life may even be put in jeopardy. Pseudonyms may provide a sense of anonymity; however, an author's writing patterns can provide clues that link the pseudonymous account to the public account. More specifically, one can extract features from text whose author is known and use them to build a model that predicts whether a given (supposedly) anonymous text belongs to that author. In this work, we first demonstrate that a person can be identified as being part of a group by using his or her tweets. We used Twitter because it is a popular platform, but the problem is not specific to Twitter. We show that an adversary can build a classifier from the public tweets of known users to match them with pseudonymous Twitter accounts. We use a simple vector-space model with tf-idf weights to represent documents and a Naive Bayes classifier with a cosine similarity measure. Through experiments with real data, we show that the problem of matching public and pseudonymous accounts exists on Twitter. We also provide a formalism to describe the problem and, based on this formalism, a solution to protect the privacy of individuals who would like to stay anonymous when writing tweets.
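
To make the vector-space idea in the abstract concrete, the following is a minimal sketch of how tf-idf weighting and cosine similarity could be used to compare an anonymous text against texts of known authors. It is not the thesis's implementation: the whitespace tokenizer, the smoothed idf formula, the nearest-author decision rule (used here in place of the Naive Bayes classifier), the function names, and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tweets would
    # need handling for mentions, hashtags, URLs, and punctuation.
    return text.lower().split()


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf-idf weight} vector."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    # Assumption: smoothed idf, so terms present in every document
    # still receive a small positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        vectors[name] = {t: (c / len(toks)) * idf[t] for t, c in counts.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(known_texts, anonymous_text):
    """Return the known author whose text is most similar, plus all scores."""
    docs = dict(known_texts)
    docs["__anonymous__"] = anonymous_text  # hypothetical placeholder key
    vectors = tfidf_vectors(docs)
    anon_vec = vectors.pop("__anonymous__")
    scores = {name: cosine(anon_vec, vec) for name, vec in vectors.items()}
    return max(scores, key=scores.get), scores


# Toy data, invented for this example only.
known = {
    "alice": "omg cant wait for the match tonight lol so hyped",
    "bob": "The committee will publish its quarterly report on Friday.",
}
author, scores = best_match(known, "lol so hyped for tonight omg")
print(author, scores)
```

In this sketch each known author is represented by a single concatenated text, and the anonymous text is attributed to the author with the highest cosine score. A trained classifier, as described in the abstract, would replace that simple decision rule.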