Visual Analytic Approaches to Mining Large-scale Time-Evolving Graphs

Project Start Date: Jul 1, 2013
Funding: Member Funded

Project Summary

Objectives: The primary goal of this project is to develop a visual analytic framework for mining large-scale time-evolving graphs. Work on the framework had three parts: (1) we developed a methodology for constructing a dependency graph, using standard association analysis techniques, to understand relationships among entities; (2) we developed a model that predicts event trends from evolutionary (temporal) graphs in which individual nodes have non-stationary correlations; and (3) we investigated the application of emerging interaction and display devices to visual analytics interfaces.
Methods: Objective 1 was accomplished by combining hierarchical clustering with association mining to identify risky customers and product dependencies. Objective 2 was accomplished by developing a prediction model for time-evolving graphs; a novel two-phase forecasting system was developed and implemented for predicting influenza trends. Objective 3 has two parts: a standard web-based visualization of predictive analytics, developed with the D3.js library, and a formal evaluation of the Handymenu menu system as an integrated part of Handymap. The evaluation design follows an iterative and constructive methodology. Multi-touch graph interaction allows users to visualize data from our project colleagues. While it builds on the Handymap application, graph interaction extends beyond the need for geo-located datasets, a primary focus of our Year 3 project.
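
The dependency-graph step of Objective 1 can be pictured with a small sketch. The summary does not give the project's actual algorithm or thresholds, so the code below is only an illustration under stated assumptions: pairwise association rules are scored by support and confidence, and each rule that clears both thresholds becomes a directed edge. The function name, the thresholds, and the basket data are hypothetical, and the hierarchical clustering step is omitted.

```python
from collections import Counter
from itertools import combinations


def build_dependency_graph(transactions, min_support=0.3, min_confidence=0.6):
    """Return directed edges {(src, dst): confidence} for rules src -> dst.

    Hypothetical helper: support is the fraction of baskets containing both
    items; confidence is that pair count divided by the count of src alone.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    edges = {}
    for (a, b), count in pair_counts.items():
        if count / n < min_support:
            continue  # the pair co-occurs too rarely to be trusted
        # Test the rule in both directions; confidence is asymmetric.
        for src, dst in ((a, b), (b, a)):
            confidence = count / item_counts[src]
            if confidence >= min_confidence:
                edges[(src, dst)] = confidence
    return edges


# Toy baskets, invented for illustration only.
transactions = [
    ["printer", "ink", "paper"],
    ["printer", "ink"],
    ["ink", "paper"],
    ["printer", "paper"],
    ["ink"],
]
edges = build_dependency_graph(transactions)
for (src, dst), conf in sorted(edges.items()):
    print(f"{src} -> {dst}: confidence {conf:.2f}")
```

With these toy baskets, printer -> ink becomes an edge but ink -> printer does not, because ink is often bought without a printer. That asymmetry is what makes the result a dependency graph rather than a plain co-occurrence graph.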

Results: Two application areas were investigated as part of this project. First, an influenza forecasting model was developed that incorporates environmental conditions (temperature, sun exposure) and influenza history. Initial results on Google Flu Trends (GFT) data indicate that the proposed model outperforms existing time series models from the literature. Second, sales transaction data were used to understand product purchase patterns and at-risk customers. The visualization work extended Year 1’s Handymap interface for sensor data exploration and introduced multi-touch displays and basic graph representation. A web-based dashboard-style interface was also developed with D3.js; it gives the user multiple views of the data on a map over time, along with summarized data in forms such as stacked line charts.
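
The idea of combining influenza history with an environmental covariate can be illustrated with a short sketch. It is not the project's two-phase model, whose details are not given in this summary. As a stand-in, it fits a one-lag autoregression with a lagged temperature term by ordinary least squares. All function names and numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_arx(cases, temperature):
    """Least-squares fit of cases[t] ~ w0 + w1*cases[t-1] + w2*temperature[t-1]."""
    rows = [[1.0, cases[t - 1], temperature[t - 1]] for t in range(1, len(cases))]
    targets = cases[1:]
    k = 3
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast_next(weights, last_cases, last_temperature):
    """One-step-ahead forecast from the fitted weights."""
    return weights[0] + weights[1] * last_cases + weights[2] * last_temperature


# Synthetic series generated from known weights (5, 0.5, -0.2); not real flu data.
temperature = [10.0, 12.0, 8.0, 15.0, 5.0, 11.0, 9.0, 14.0]
cases = [20.0]
for t in range(1, len(temperature)):
    cases.append(5.0 + 0.5 * cases[t - 1] - 0.2 * temperature[t - 1])

weights = fit_arx(cases, temperature)
next_week = forecast_next(weights, cases[-1], temperature[-1])
```

Because the synthetic series is noise-free, the fit recovers the generating weights, which serves as a quick sanity check. Real surveillance data would call for held-out evaluation against baseline time series models.
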
Conclusion: Developing the visual analytic framework involved work on three fronts. (1) A dependency graph was constructed from sales transaction data. It supports analysis of dependencies among product categories, customers, and products to identify critical products and at-risk customers. The results still need to be evaluated with a domain expert. The graph can be used to build recommendation graphs, to target customer marketing based on purchase behavior, etc. (2) The two-phase vectorized time series model for influenza forecasting performs better than existing methods discussed in the literature. The forecasting model can be applied to any dataset arriving from a network of sensors. (3) The Handymap application provided new insights into graph interaction. The tools developed will be used in the Year 3 project.