Effective Topic Detection over Social Media

PhD Thesis Proposal Defence

Title: "Effective Topic Detection over Social Media"




Social networks (SNs) such as Facebook and Twitter are now very popular, and 
thousands of users post tweets every day. In this proposal, we address two 
common problems in processing tweets. First, we select the most significant 
messages from a corpus of tweets, so that we can remove noise from the dataset 
and extract information only from important messages. Second, we propose a 
topic detection model that incorporates time and location.

Concerning the filtering of tweets, we propose a method for classifying tweet 
messages into two classes: informative and non-informative. We consider a 
message informative if it contains information of interest to the public, such 
as trends, events and news. Non-informative tweets are personal messages that 
do not interest the public, such as conversations between friends, feelings 
and descriptions of mood. Our motivation is to keep the informative tweets, 
which contain essential information, and to filter out the rest. Real 
applications that can benefit from this work include trend and topic detection, 
recommendation systems, and applications that make predictions based on user 
messages on social media.

Tweets are challenging to process because they are short, unstructured and 
often have no clear topic. We propose a weighted variation of the binary 
multinomial naive Bayes model to identify informative messages. We train the 
classifier and evaluate it using 5-fold and 10-fold cross-validation, and we 
compare the results with the original binary multinomial naive Bayes model. 
We use two independent datasets of tweet messages crawled from the web. We 
report the following metrics: accuracy, recall, specificity, and the F-measure 
with its variations (F2 score and F0.5 score).
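
As a rough illustration of the baseline and the metrics named above, the 
following Python sketch trains a plain binary multinomial naive Bayes 
classifier (each token counted at most once per message) and computes 
accuracy, recall, specificity and the F-beta score from confusion-matrix 
counts. It does not reproduce the weighted variation proposed here, whose 
weighting scheme is not described in this abstract. The function names, the 
token-list input format and the Laplace smoothing constant `alpha` are 
assumptions made for the example only.

```python
import math
from collections import Counter


def train_binary_nb(docs, labels, alpha=1.0):
    """Baseline binary multinomial naive Bayes (not the weighted variant).

    docs: list of token lists; labels: one class label per document.
    alpha: Laplace smoothing constant (illustrative assumption).
    """
    vocab = {tok for doc in docs for tok in doc}
    classes = sorted(set(labels))
    doc_counts = Counter(labels)
    term_counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        # "Binary": each distinct token counts at most once per document.
        term_counts[label].update(set(doc))
    log_prior = {c: math.log(doc_counts[c] / len(docs)) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(term_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            t: math.log((term_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_like


def predict(model, doc):
    """Return the class with the highest log-posterior score for one document."""
    log_prior, log_like = model
    scores = {}
    for c, prior in log_prior.items():
        # Tokens that never appeared in training are ignored.
        known = [t for t in set(doc) if t in log_like[c]]
        scores[c] = prior + sum(log_like[c][t] for t in known)
    return max(scores, key=scores.get)


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(tp, fp, tn, fn, beta=1.0):
    """Compute the reported metrics from confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    b2 = beta * beta
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "f_beta": _ratio((1 + b2) * precision * recall, b2 * precision + recall),
    }
```

Calling `evaluate` with `beta=2.0` gives the F2 score, which weights recall 
more heavily than precision, and `beta=0.5` gives the F0.5 score, which 
weights precision more heavily.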

Concerning topic detection, existing solutions overlook time and location, 
which are important and useful factors. Moreover, social media content is 
frequently updated, so a detection model should handle dynamic updates. We 
introduce a topic model for topic detection that combines time and location. 
The model estimates its parameters incrementally and adapts the window length 
according to the correlation between consecutive windows and their density. 
We have conducted extensive experiments to verify the effectiveness and 
efficiency of the proposed Incremental Adaptive Time Location (IncrAdapTL) 
model.

Date:                   Thursday, 4 October 2018

Time:                   9:00am - 11:00am

Venue:                  Room 5562
                        (lifts 27/28)

Committee Members:      Prof. Lei Chen (Supervisor)
                        Dr. Xiaojuan Ma (Chairperson)
                        Dr. Qiong Luo
                        Dr. Wei Wang

**** ALL are Welcome ****