Stefan Bosse - Surrogate Predictive and Multi-domain Modelling of Complex Systems
Challenges and Pitfalls
PD Dr. Stefan Bosse
sbosse@uni-bremen.de
University of Bremen, Dept. Mathematics and Computer Science, Bremen, Germany
Introduction Stefan Bosse - Surrogate Predictive and Multi-domain Modelling of Complex Systems
Topic
Modelling and Simulation of complex dynamic systems like pandemic outbreaks or traffic flows in cities
Derivation of macro-level aggregate variables (observables) from analytical models and simulation on micro-level
Prediction of time-dependent aggregate variables by combination of Machine Learning and Simulation
Coupling of simulation with real-world environments including digital twin methodology
Issues
High variance at the entity micro-level and unknown or incomplete entity interaction models
Lack of sensor and model calibrations ⇒ affecting functional modelling and simulation
Accessibility of real-world sensor data
High dimensionality, size, and distortion (bias, test coverage, skewness of distributions) of real-world data
High number of micro-level entities in simulation (for statistical strength) ⇒ Computational complexity and time
The combination of Machine Learning and simulation can improve model and simulation quality in different ways:
Machine Learning assisted simulation improving the simulation model and quality;
Simulation assisted Machine Learning improving the prediction or classification model;
Emulation of the multi-agent behaviour model by an ML derived macro-level model (surrogate modelling);
Model and sensor calibration using ML.
Hybrid Methodology Stefan Bosse - Surrogate Predictive and Multi-domain Modelling of Complex Systems
The major issue with real-world coupled simulations and predictive machine modelling from simulation is the discrepancy of sensor data (input and output observables) collected in real and simulation domains.
Hybrid MAS-CA simulation featuring:
Hierarchical domain-specific simulation and decomposition (with respect to longitudinal and spatial scale);
Predictive modelling of time-series data of aggregate variables using state-based ML models trained on real-world and simulation data.
Simulation augments or replaces real-world data
Augmented data is used to train predictive models, e.g., infection rate development
Sequential time-series predictive models for surrogate observable variables with auxiliary sensor variables
Models are finally applied to real-world data to predict real-world observables!
Simulation on micro-level with computed macro-level observables using sensors (S) and test probing
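The idea of computing a macro-level observable by probing micro-level entities with a sensor can be sketched as follows. This is a minimal illustration, not the simulator used in the talk: the entity representation, the function names, and the two sampling weights are assumptions made for the example. It shows how the test strategy alone can separate the probed infection rate from the true one.

```python
import random
from typing import List, Tuple

# Hypothetical micro-level entity: (infected, symptomatic)
Entity = Tuple[bool, bool]


def true_rate(population: List[Entity]) -> float:
    """Exact infection rate over all entities (only known inside a simulation)."""
    return sum(1 for infected, _ in population if infected) / len(population)


def probed_rate(population: List[Entity], n_tests: int,
                weight_symptomatic: float, weight_other: float,
                rng: random.Random) -> float:
    """Infection rate seen by a sensor that tests n_tests entities.

    Entities are drawn with replacement; the two weights model the test
    strategy (assumed here, the real strategy is unknown).
    """
    weights = [weight_symptomatic if symptomatic else weight_other
               for _, symptomatic in population]
    tested = rng.choices(population, weights=weights, k=n_tests)
    return sum(1 for infected, _ in tested if infected) / n_tests


# Made-up population: 1000 entities, 100 infected, 50 of them symptomatic.
population: List[Entity] = (
    [(True, True)] * 50 + [(True, False)] * 50 + [(False, False)] * 900
)
rng = random.Random(0)
uniform = probed_rate(population, 200, 1.0, 1.0, rng)          # unbiased probing
symptomatic_only = probed_rate(population, 200, 1.0, 0.0, rng)  # biased probing
```

With `weight_other=0.0` only symptomatic entities are tested, so the probed rate is 1.0 although the true rate is 0.1. In a simulation both values are available, which is what makes sensor calibration possible there.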
The hybrid overall architecture and methodology for predictive surrogate modelling of time-dependent system observables (here infection rates of a pandemic situation) using state-based machine learning models
The aggregated data collected from simulation is used to train a surrogate machine model for time-series prediction.
A state-based Long Short-Term Memory (LSTM) artificial neural network architecture was chosen for time-series prediction
An LSTM network is able to predict a time-dependent variable x(n) at a future sample point n+Δ from past data {x(1),..,x(n)}.
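The prediction task x(n+Δ) from past samples can be turned into supervised training pairs as sketched below. This is only an illustration of the data preparation: the fixed window length, the function name `make_windows`, and the sample values are assumptions, and the slides do not state how the training samples were actually formed.

```python
from typing import List, Tuple


def make_windows(series: List[float], window: int,
                 delta: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` past samples with the sample `delta`
    steps after the end of that run (the prediction target)."""
    pairs = []
    for i in range(len(series) - window - delta + 1):
        past = series[i:i + window]
        # The last index of the window is i + window - 1; the target lies delta later.
        target = series[i + window - 1 + delta]
        pairs.append((past, target))
    return pairs


# Made-up weekly infection-rate values, for illustration only.
weekly_rate = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0, 22.0]
pairs = make_windows(weekly_rate, window=3, delta=2)
# First pair: past [1.0, 2.0, 4.0], target 11.0
```

Each pair would then serve as one input sequence and one target value for a sequence model such as an LSTM.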
Case-study: Pandemic Outbreak Stefan Bosse - Surrogate Predictive and Multi-domain Modelling of Complex Systems
Goal: Future prediction of system observable infection rate (or cases) from past data with a machine model trained with real and/or simulation data
The main issue with pandemic databases is the high bias and distortion of sampled population data (infection cases) due to uncalibrated sensors and an unknown test strategy (cross section)
Developing time-series prediction models for pandemic observables from population data on a long-term time scale is nearly impossible!
Simulation world (Germany) partitioned into 38 terrestrial units (TUs, NUTS level 2) mapped on 38 CA worlds, Cartesian coordinates, not ratio scaled. The size of each CA grid is related to the TU domain size and population density.
Each TU is simulated with a CA partitioned in sub-domain areas
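A single update step of a probabilistic cellular automaton for one such CA world could look like the sketch below. The three cell states, the von Neumann neighbourhood, and the parameters `p_infect` and `p_recover` are assumptions chosen for illustration. The slides do not specify the transition rules of the actual simulation.

```python
import random
from typing import List

S, I, R = 0, 1, 2  # assumed cell states: susceptible, infected, recovered
Grid = List[List[int]]


def infected_neighbours(grid: Grid, r: int, c: int) -> int:
    """Count infected cells among the four direct neighbours."""
    rows, cols = len(grid), len(grid[0])
    count = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] == I:
            count += 1
    return count


def ca_step(grid: Grid, p_infect: float, p_recover: float,
            rng: random.Random) -> Grid:
    """One synchronous update; the input grid is left unchanged."""
    new = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, state in enumerate(row):
            if state == S:
                k = infected_neighbours(grid, r, c)
                # Probability of at least one successful transmission from k neighbours.
                if k and rng.random() < 1.0 - (1.0 - p_infect) ** k:
                    new[r][c] = I
            elif state == I and rng.random() < p_recover:
                new[r][c] = R
    return new


def infection_rate(grid: Grid) -> float:
    """Macro-level observable: fraction of currently infected cells."""
    cells = [s for row in grid for s in row]
    return cells.count(I) / len(cells)


grid = [[S, S, S], [S, I, S], [S, S, S]]
after = ca_step(grid, p_infect=1.0, p_recover=0.0, rng=random.Random(0))
```

With `p_infect=1.0` and `p_recover=0.0` the step is deterministic: the centre cell infects its four direct neighbours and the corner cells stay susceptible.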
Time-series prediction by a machine learning LSTM model trained with population data and predicting the future development of infection rate
There is a set of independent predictive models $M=\{m_d\}_{d=1}^{38}$, one for each terrestrial unit domain (TU)
Data: Population data from Robert Koch Institute (uncalibrated, as-is data)
Evaluation strategies
Training of predictive time-series model for observable infection rate for one TU (Bremen) over full longitudinal range (52 weeks); y0: original data, y: predicted data
Training of predictive time-series model for infection rate observable for one TU (Bremen) over partial longitudinal range (40 weeks); y0: original data, y: predicted data
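Training on a partial longitudinal range and comparing y0 with y on the remaining weeks can be sketched as follows. The helper names and the use of the root-mean-square error are assumptions for illustration. The slides do not state which error measure was used.

```python
import math
from typing import List, Sequence, Tuple


def split_by_week(series: Sequence[float],
                  n_train: int) -> Tuple[List[float], List[float]]:
    """Split a weekly series into a training range and a held-out range."""
    return list(series[:n_train]), list(series[n_train:])


def rmse(y0: Sequence[float], y: Sequence[float]) -> float:
    """Root-mean-square error between original data y0 and predicted data y."""
    if len(y0) != len(y):
        raise ValueError("y0 and y must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y0, y)) / len(y0))


# Placeholder series of 52 weekly values (not real data).
weekly = [float(w) for w in range(52)]
train, held_out = split_by_week(weekly, 40)
```

A model would be fitted on `train` only, and `rmse(held_out, predictions)` would then measure how well it extrapolates to the 12 unseen weeks.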
Application of predictive time-series model TU Bremen for infection rate observable to TU Koblenz; y0: original data, y: predicted data
There is a set of independent simulations $U=\{u_d\}_{d=1}^{38}$ and predictive models $M_U=\{m_d^u\}_{d=1}^{38}$, one for each terrestrial unit domain (TU)
Each TU is represented by an agent controlling a Cellular Automaton (CA).
Training of predictive time-series model for infection rate observable for simulated TU (Bremen) over full and partial longitudinal range (70/30 weeks); y0: original data, y: predicted data
Summary Stefan Bosse - Surrogate Predictive and Multi-domain Modelling of Complex Systems
Prediction of time-dependent population observables from domestic population data mostly fails due to uncalibrated and distorted sensors
Simulation of large-scale populations, as in pandemic situations, is a challenge due to the high number of entities and the high degree of domain-dependent and individual behaviour variance, which is typically not covered
Digital Twin concepts can improve simulation by introducing micro-level variance
Surrogate modelling using simulation data can replace computationally complex agent-based simulations
A hybrid simulation model combining agent-based and probabilistic cellular automata methodologies is a good trade-off
Spatial domain partitioning can further improve prediction accuracy
Questions and Comments are welcome!
Surrogate Predictive and Multi-domain Modelling of Complex Systems by fusion of Agent-based Simulation, Cellular Automata, and Machine Learning