The objective of this project was to detect the onset of events using social media, especially Twitter streams. To accomplish this, we (a) developed a new method to detect the onset of events, (b) studied topic evolution models, which take unstructured, time-stamped data as input and identify, in an unsupervised manner, the latent topics and how these topics evolve, and (c) generated a simple visualization of arriving tweets.
The Event Detection on Onset (EDO) method employs a divergence score, a series of thresholds, a graph representation, a graph clustering method, and an event/topic evolution model to detect the onset of events. The topic evolution study focused on hierarchical Dirichlet process (HDP) topic models, which our studies confirm are superior to traditional alternatives. We also developed a simple visualization of the arriving tweets, which advances as time progresses.
The EDO method was able to identify the onset of events about 3 minutes after the first tweet about a disaster-related event appeared; it can also be used in other domains to find changes in patterns. The method outperformed an equivalent state-of-the-art method. For the HDP work, a study confirmed HDP's superiority, and a timeflow visualization was prototyped with open-source D3 and Java. Finally, a preliminary prototype for tweet visualization was developed, but it still needs to be connected to a live source of tweets.