Topic Detection in Twitter Using Topology Data Analysis

Authors: 
Pablo Torres Tramon, Hugo Hromic, Bahareh Rahmanzadeh Heravi
Publication Type: 
Refereed Conference Meeting Proceeding
Abstract: 
The massive volume of content generated by social media greatly exceeds human capacity to manually process this data in order to identify topics of interest. As a solution, various automated topic detection approaches have been proposed, most of which are based on document clustering and burst detection. These approaches normally represent textual features in standard n-dimensional Euclidean metric spaces. However, in these cases, directly filtering noisy documents is challenging for topic detection. Instead, we propose Topol, a topic detection method based on Topology Data Analysis (TDA) that transforms the Euclidean feature space into a topological space where the shapes of noisy, irrelevant documents are much easier to distinguish from those of topically relevant documents. This topological space is organised in a network according to the connectivity of the points, i.e. the documents, and by filtering only on the size of the connected components we obtain competitive results compared to other state-of-the-art topic detection methods.
Conference Name: 
SoWeMine Workshop @ ICWE 2015
Digital Object Identifier (DOI): 
10.1007/978-3-319-24800-4_16
Publication Date: 
23/06/2015
Volume: 
9396
Pages: 
186–197
Conference Location: 
Netherlands
Research Group: 
Institution: 
National University of Ireland, Galway (NUIG)
Open access repository: 
Yes