Solving Data Integration and Inter-data Relationships from a Wide Variety of Data Sources – 9a.001.UL

Project Start Date: Aug 1, 2020
Research Areas: Analytics, Visualization
Funding: Member Funded

Project Summary

In this project we plan to use state-of-the-art machine learning (ML) data fusion methods and workflows to maximize the information extracted from the wide range of available environmental datasets. The state of Louisiana is currently pursuing a statewide modeling and monitoring effort for flood applications. That effort can benefit greatly from methods that reduce the ambiguity arising from the large volumes of data collected by field sensors and generated by hydrologic and hydraulic models.

A thorough review of the current state of the technology makes it clear that gaps remain before existing data fusion methods can be applied effectively to environmental applications in general and flood modeling in particular. Very few studies have explored multimodal data fusion (e.g., across different sensors and different spatiotemporal resolutions) using the latest advances in ML for environmental science applications, and those studies are limited to specific tasks such as land cover classification. An even smaller set has applied cooperative multimodal data fusion (i.e., combining information from multiple independent sources into a new, more complex type of information) using ML methods such as Support Vector Machines (SVMs) and ranking SVM tools.

In this project we will focus on feature-based data fusion analyses. This approach allows us to produce datasets that are free of redundant information and in which the input datasets are ranked and prioritized by the information they contribute to the downstream ML prediction models. For example, these methods will be able to fuse multimodal data obtained from different sensors (e.g., satellite and ground observations) and of different types (e.g., model outputs at different spatial and temporal resolutions).
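As a rough illustration of the feature-based fusion idea, one minimal sketch ranks candidate features from several modalities by their association with a target quantity and greedily drops near-duplicates. The feature names, the correlation threshold, and the use of Pearson correlation as the ranking criterion below are illustrative assumptions for synthetic data, not the project's actual method.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fuse_and_rank(features, target, redundancy_cut=0.95):
    """Rank features by |correlation| with the target, then greedily
    drop any feature that is nearly collinear with one already kept."""
    ranked = sorted(features.items(),
                    key=lambda kv: abs(pearson(kv[1], target)),
                    reverse=True)
    kept = []
    for name, col in ranked:
        if all(abs(pearson(col, features[k])) < redundancy_cut for k in kept):
            kept.append(name)
    return kept

# Synthetic example: two hypothetical "modalities" observing the same scene.
random.seed(0)
target = [random.gauss(0, 1) for _ in range(200)]
features = {
    "satellite_ndvi": [t + random.gauss(0, 0.3) for t in target],  # informative
    "gauge_stage":    [t + random.gauss(0, 0.8) for t in target],  # weaker signal
    "model_noise":    [random.gauss(0, 1) for _ in range(200)],    # uninformative
}
# A near-duplicate of an existing feature: should be flagged as redundant.
features["satellite_copy"] = [v + random.gauss(0, 0.01)
                              for v in features["satellite_ndvi"]]

print(fuse_and_rank(features, target))
```

The output keeps one of the two nearly identical satellite features and orders the rest by how strongly they relate to the target, which is the redundancy-removal and prioritization behavior described above in miniature.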