Due to recent advances in environmental monitoring, modeling, and hydro-informatics, a plethora of environmental observations is being continuously collected from various sources (e.g., spaceborne sensors, ground radars, field sensors, social media, and IoT sensors). For decades, these observations have been used individually to enhance our understanding of certain processes or to support efforts to model and forecast certain events (e.g., floods and other natural disasters). Nevertheless, this volume of environmental observations, which now easily falls under the category of “Big Data”, has yet to be integrated and merged within a predictive analytics approach in a way that makes the observations easy for decision makers and the public to understand, interpret, and use in applications such as emergency evacuations, road closures, and access mapping for areas devastated by natural disasters.
In this project, we propose a data analytics approach and apply it to flood forecasting and warning systems as our case study. The approach is nonetheless intended to be scalable and transferable to other applications. Specifically, we plan to develop methodologies for data integration and for the discovery of complex inter-data relationships that leverage remotely sensed and ground-based observations together with the latest Machine Learning and MI approaches, with the aim of answering the following research questions:
How can we develop AI schemes that efficiently establish ontology-based data integration of environmental driver variables (e.g., observed rainfall, antecedent soil moisture, land cover classification, and upstream water levels) and response variables (e.g., observed downstream water levels, observed flood extent, and numerical model outputs)?
How can these schemes be used to establish data-based predictive capabilities in flood applications in the short term (e.g., extreme events and large-scale local projects) as well as in the long term (e.g., climate change and anthropogenic manipulation of the landscape)?
What predictive power does the integrated catalog of multi-scale, multi-resolution data offer through machine learning methodologies, beyond what can be derived from a single data source or a subgroup of sources?
How can the large number of datasets/variables be reduced to a more restricted set with predictive power?
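To make the last question concrete, the following is a minimal sketch of one simple way to shrink a large set of candidate variables: rank each driver by the absolute value of its Pearson correlation with the response and keep the top few. This univariate filter is only an illustrative stand-in, not the methodology the project commits to. The variable names are borrowed from the examples in the research questions, the data are synthetic, and the function names (`pearson`, `rank_predictors`, `select_top_k`) are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant series carries no information; treat its correlation as zero.
    return cov / (std_x * std_y) if std_x > 0 and std_y > 0 else 0.0


def rank_predictors(drivers, response):
    """Return (name, |correlation|) pairs, strongest association first."""
    scores = {name: abs(pearson(values, response))
              for name, values in drivers.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_top_k(drivers, response, k):
    """Keep the names of the k drivers most correlated with the response."""
    return [name for name, _ in rank_predictors(drivers, response)[:k]]


# Synthetic, purely illustrative data (assumption: standardized anomalies).
rng = random.Random(0)
n_samples = 500
drivers = {
    "rainfall": [rng.gauss(0, 1) for _ in range(n_samples)],
    "antecedent_soil_moisture": [rng.gauss(0, 1) for _ in range(n_samples)],
    "upstream_water_level": [rng.gauss(0, 1) for _ in range(n_samples)],
    "unrelated_a": [rng.gauss(0, 1) for _ in range(n_samples)],
    "unrelated_b": [rng.gauss(0, 1) for _ in range(n_samples)],
}

# Hypothetical response: downstream level driven by three of the five inputs.
downstream_water_level = [
    0.6 * r + 0.5 * s + 0.8 * u + rng.gauss(0, 0.3)
    for r, s, u in zip(drivers["rainfall"],
                       drivers["antecedent_soil_moisture"],
                       drivers["upstream_water_level"])
]

selected = select_top_k(drivers, downstream_water_level, k=3)
print(selected)
```

A filter of this kind scores each variable in isolation, so it cannot detect predictors that matter only in combination, and it does not penalize redundant variables that carry the same signal. Multivariate or model-based selection would be needed to address those cases.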