The value of mining results discovered in data streams depreciates quickly. Online mining capability is therefore crucial for anomaly detection and dynamic association mining in streaming data. Association rule mining helps uncover relationships between seemingly unrelated data. Most anomaly detection methods focus on point anomalies, whereas in practice many fraudulent behaviors can be detected only through collective analysis of sequences of data. Techniques such as in-memory databases, sliding windows, and dynamic modeling help to meet the real-time requirement. The objectives of this project are to:
1. Develop scalable algorithms that process large volumes of data to mine association rules over time frames. These algorithms should address the data processing latency that causes data to depreciate in value.
2. Develop distributed batch processing algorithms for building a model from the available high-volume historical data. The model should capture the sequence patterns present in the data. To overcome concept-drift problems, the model should be updated frequently.
3. Develop online distributed stream processing algorithms for comparing fast, continuously arriving data with the model, evaluating collective anomalies, and updating the model if needed.
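
To make the idea of mining association rules over time frames with a sliding window more concrete, the following is a minimal single-machine sketch, not the project's algorithm. It keeps item and item-pair counts for the most recent transactions and derives pairwise rules from them. The class name `SlidingWindowRules`, its parameters, and the restriction to rules between two single items are assumptions made for illustration only; a distributed, scalable version would need to partition both the counts and the window.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowRules:
    """Hypothetical helper: pairwise association rules over the most
    recent `window_size` transactions of a stream."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()          # transactions currently in the window
        self.item_counts = Counter()   # item -> number of window transactions containing it
        self.pair_counts = Counter()   # (item_a, item_b) -> co-occurrence count, item_a < item_b

    def _update(self, transaction, delta):
        # Add (delta=+1) or remove (delta=-1) one transaction's contribution.
        items = sorted(set(transaction))
        for item in items:
            self.item_counts[item] += delta
        for pair in combinations(items, 2):
            self.pair_counts[pair] += delta

    def add(self, transaction):
        # Incremental update: count the new transaction, then evict the
        # oldest one if the window has grown past its size limit.
        self.window.append(transaction)
        self._update(transaction, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def rules(self, min_support, min_confidence):
        # Return (lhs, rhs, support, confidence) for every rule lhs -> rhs
        # that meets both thresholds within the current window.
        n = len(self.window)
        found = []
        for (a, b), count in self.pair_counts.items():
            if count == 0 or count / n < min_support:
                continue
            for lhs, rhs in ((a, b), (b, a)):
                confidence = count / self.item_counts[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, rhs, count / n, confidence))
        return found


if __name__ == "__main__":
    miner = SlidingWindowRules(window_size=3)
    stream = [
        ["bread", "milk"],
        ["bread", "milk", "eggs"],
        ["milk", "eggs"],
        ["bread", "eggs"],
    ]
    for transaction in stream:
        miner.add(transaction)
    for lhs, rhs, support, confidence in miner.rules(0.6, 0.9):
        print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Because each arriving transaction only adjusts the counts for its own items and for the evicted transaction's items, the cost per update is independent of the window size, which is the property that keeps results fresh as the stream advances. Rules that held only in older data disappear on their own once the supporting transactions leave the window.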