PD Stefan Bosse - Long-term Longitudinal data collection and analysis using Agents

Long-term Longitudinal data collection and analysis in highly dynamic systems using mobile Crowd Sensing and mobile Agents

Challenges and Issues

PD Dr. Stefan Bosse


University of Bremen, Dept. Mathematics and Computer Science, Bremen, Germany

This work addresses longitudinal data collection and aggregation that can be used for:

  1. Data Analysis and Data Mining (statistical);
  2. Data- and Event-driven Simulation;
  3. Automated Prediction and Classification using Machine Learning (ML is a kind of simulation, i.e. extrapolation);
  4. Time-series analysis and prediction.

All four domains depend on the strength and statistical quality of the data on both the vertical (variable) and the horizontal (longitudinal time) scale!

Incremental longitudinal data sampling is a challenge!

Longitudinal Surveys

Typical applications of classical longitudinal surveys are (Lynn, 2009):

  • Surveys of businesses
  • Surveys of school-leavers, graduates or trainees
  • Household panel surveys
  • Birth cohort studies
  • Epidemiological studies
  • Social Networking
  • Socio-technical Systems

Surveys are typically participatory and rely on models and survey plans

Crowdsensing is typically opportunistic and self-organizing

Longitudinal Data Sampling

(a) Traditional survey-based data sampling and static modelling using participatory mechanisms
(b) Continuous crowdsensing-based, data-driven modelling using opportunistic mechanisms

Longitudinal Data Sampling

Issues with Longitudinal Sampling:

  • Curse of dimensionality of longitudinal data (LD), which spans the dimensions P: Persons, O: Occasions, L: Locations/Places, V: Variables, and t: time

  • Sampling in time space (horizontal axis)
    • periodically (polling);
    • event-based;
    • random.
  • Sampling in variable space (vertical axis)
    • Bias, Fraud,
    • Distortion, Noise, Failure
    • Missing data, impurity
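
The three time-space sampling strategies listed above can be contrasted with a minimal sketch. The signal values, the threshold, and all function names are assumptions made for illustration only; they are not taken from the presented framework.

```python
import random

def periodic_sampling(signal, period):
    """Polling: take every `period`-th value, whether or not it changed."""
    return [(t, signal[t]) for t in range(0, len(signal), period)]

def event_based_sampling(signal, threshold):
    """Take a sample only when the value moved by at least `threshold`
    since the last recorded sample."""
    samples = [(0, signal[0])]
    for t in range(1, len(signal)):
        if abs(signal[t] - samples[-1][1]) >= threshold:
            samples.append((t, signal[t]))
    return samples

def random_sampling(signal, n, seed=0):
    """Take `n` samples at random, distinct time points."""
    rng = random.Random(seed)
    times = sorted(rng.sample(range(len(signal)), n))
    return [(t, signal[t]) for t in times]

# Hypothetical variable observed over 12 time steps (short spike at t = 9).
signal = [0, 0, 0, 1, 5, 5, 5, 5, 5, 9, 2, 2]

periodic = periodic_sampling(signal, period=4)      # [(0, 0), (4, 5), (8, 5)]
events = event_based_sampling(signal, threshold=2)  # [(0, 0), (4, 5), (9, 9), (10, 2)]
randomly = random_sampling(signal, n=3)
```

In this toy series, polling with a period of 4 misses the short spike at t = 9, whereas event-based sampling records it but produces irregularly spaced time points.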

Errors in Longitudinal Data Sampling

Coverage Error

Sampling Error

Non-response Error

Measurement Error

(Lynn, 2009)

(Mobile) Crowdsensing can help to reduce Coverage, Sampling, and Non-response Errors and to extend the data space with environmental/context sensor variables

On-line vs. Off-line Data Mining and Machine Learning

(a) Off-line Surveys (b) On-line Longitudinal Data Mining

On-line vs. Off-line Simulation

(a) Off-line data-driven agent-based simulation (ABS) (b) On-line data- and event-driven ABS

The Concept

A Unified Approach: Agents connect Real World & Simulation

  • Agent-based Modelling
  • Agent-based Simulation
  • Agent-based Computation
  • Mobile agent-based Crowdsensing
  • Machine Learning
  • Surrogate Modelling

Mobile Crowdsensing is event-driven or request-reply-based. It uses mobile agents to sample sensors on mobile devices and to perform micro surveys (dynamic/conditional scripts) via chat dialogues.
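
A micro survey as a dynamic/conditional script could be sketched as follows. The question texts, the `Question` structure, and the `run_survey` helper are hypothetical. In the real system the script would be carried by a mobile agent and the answers would come from a chat dialogue rather than from a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Question:
    key: str
    text: str
    # Optional condition on the answers collected so far; the question
    # is skipped when the condition returns False.
    condition: Optional[Callable[[Dict[str, str]], bool]] = None

def run_survey(script: List[Question], ask: Callable[[str], str]) -> Dict[str, str]:
    """Walk through the script and ask only the questions whose
    condition holds for the answers given so far."""
    answers: Dict[str, str] = {}
    for q in script:
        if q.condition is None or q.condition(answers):
            answers[q.key] = ask(q.text)
    return answers

# Hypothetical micro survey with one conditional follow-up question.
script = [
    Question("outside", "Did you leave home today? (yes/no)"),
    Question("transport", "Which transport did you use?",
             condition=lambda a: a.get("outside") == "yes"),
    Question("mood", "How do you feel? (good/bad)"),
]

# Canned replies stand in for the user's chat input.
replies = {
    "Did you leave home today? (yes/no)": "no",
    "Which transport did you use?": "bus",
    "How do you feel? (good/bad)": "good",
}
answers = run_survey(script, ask=lambda text: replies[text])
```

Because the first answer is "no", the transport question is never asked, so the resulting record contains only the `outside` and `mood` variables.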

The Concept

Unified Agent Methodology for longitudinal data mining, modelling of complex systems, and simulation

Agent-based On-line Simulation: Software Architecture

Two agent classes are used: physical simulation agents (red) and computational software agents (blue, simulation and real world).

(Bosse, Engel, 2019, Sensors)

Time Machine

ABC Crowdsensing can be used to:

  1. Update simulations in real time ⇒ Variance by Digital Twins;
  2. Fork simulation runs with time-compressing speed-up;
  3. Create simulation snapshots for predicting future world evolution ⇒ Weather Forecast.
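
The update, snapshot, and fork idea can be illustrated with a minimal sketch. The `Simulation` class, its random-walk dynamics, and the `fork_and_run` helper are assumptions chosen for brevity; they stand in for a full agent-based simulation world and are not the presented architecture.

```python
import copy
import random

class Simulation:
    """Toy stand-in for an agent-based simulation world."""
    def __init__(self, positions, seed=0):
        self.time = 0
        self.positions = list(positions)   # one 1-D position per agent
        self.rng = random.Random(seed)

    def step(self):
        """Advance the world by one time step (random walk per agent)."""
        self.positions = [p + self.rng.choice((-1, 1)) for p in self.positions]
        self.time += 1

    def update_from_sensing(self, agent_index, position):
        """Synchronise one agent with a value sampled in the real world."""
        self.positions[agent_index] = position

    def snapshot(self):
        """Freeze the complete state, including the random generator."""
        return copy.deepcopy(self)

def fork_and_run(snapshot, steps):
    """Run a copy of the snapshot ahead without touching the snapshot."""
    fork = copy.deepcopy(snapshot)
    for _ in range(steps):
        fork.step()
    return fork

live = Simulation(positions=[0, 0, 0])
live.step()
live.update_from_sensing(agent_index=1, position=10)  # crowdsensing update

snap = live.snapshot()
future = fork_and_run(snap, steps=50)   # look ahead at full compute speed
```

The fork advances 50 simulated steps while the live simulation and the snapshot stay at their original time, which is what allows several alternative futures to be explored from the same starting state.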

Virtual Sensors

  • Agents can take on the following roles:
    • Physical agents in simulation;
    • Computational agents performing crowd sensing (physical sensors);
    • Computational agents performing sensor aggregation, event detection, and data reduction (virtual sensors) ⇒ Longitudinal Data Sampling.

Virtual sensors implemented by mobile or stationary agents are a central part of the longitudinal data sampling and data reduction methodology (including calibration).
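
A virtual sensor that combines aggregation, event detection, data reduction, and a simple calibration step might look like the following sketch. The class name, the window size, the threshold, and the additive calibration offset are all hypothetical choices, not the method used in the presented work.

```python
from statistics import mean

class VirtualSensor:
    """Aggregates raw readings of physical sensors into one value per
    time window and flags windows that look like events."""

    def __init__(self, window, event_threshold, offset=0.0):
        self.window = window                    # readings per aggregate
        self.event_threshold = event_threshold  # aggregate level counted as event
        self.offset = offset                    # simple additive calibration
        self.buffer = []
        self.series = []                        # reduced longitudinal record

    def push(self, reading):
        """Feed one raw reading; emit an aggregate when the window is full."""
        self.buffer.append(reading + self.offset)
        if len(self.buffer) == self.window:
            value = mean(self.buffer)
            self.series.append({"value": value,
                                "event": value >= self.event_threshold})
            self.buffer = []

sensor = VirtualSensor(window=4, event_threshold=5.0, offset=-1.0)
for raw in [2, 3, 2, 1, 8, 9, 7, 8, 3, 2]:
    sensor.push(raw)
```

Ten raw readings are reduced to two aggregated records (mean 1.0 without event, mean 7.0 with event); the last two readings stay in the buffer until the next window is complete.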

Three sensing domains: (Left) Physical Sensors (Middle) Virtual Sensors (Right) Data Mining/Application

Use-case: Pandemic Simulation and Time-series Prediction

  1. Goal: Time-series prediction of the dynamics of infection cases in pandemic situations

  2. Methodologies:

    a. Data mining of existing institutional longitudinal data and Machine Learning (time-series extrapolation)
    b. Surrogate modelling of the ABS using data from simulation and auxiliary data from mobile crowd sensing (crowd behaviour, decision making, opinions); the simulation is seeded with data from a.

Institutional Data Mining and Machine Prediction

  • Data from the Robert Koch Institute (weekly notifications of infection cases)
  • Time-series prediction by LSTM-ANN
  • Biased and distorted/uncalibrated sensor data (unknown test sampling over time)

Crowd-driven Simulation and Surrogate Modelling

  • Domain-partitioned, parameterised Gas Cellular Automata simulation
  • Time-series prediction by LSTM-ANN
  • Calibrated sensor data from simulation
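
Both prediction paths above feed a time series into a sequence model. The sketch below shows only the data preparation that such a model typically needs, namely sliding windows with the next value as target, plus a naive persistence baseline to compare against. The case counts are made-up numbers, not Robert Koch Institute data, and no LSTM is trained here; the function names are assumptions.

```python
def make_windows(series, lookback):
    """Turn a time series into (input window, next value) training pairs."""
    pairs = []
    for i in range(len(series) - lookback):
        pairs.append((series[i:i + lookback], series[i + lookback]))
    return pairs

def persistence_forecast(window):
    """Naive baseline: predict that next week equals the last week."""
    return window[-1]

def mean_absolute_error(pairs, predict):
    """Average absolute deviation between prediction and target."""
    return sum(abs(predict(x) - y) for x, y in pairs) / len(pairs)

# Made-up weekly case counts, used only to show the data layout.
weekly_cases = [10, 12, 15, 20, 26, 30, 28, 25]

pairs = make_windows(weekly_cases, lookback=3)
baseline_error = mean_absolute_error(pairs, persistence_forecast)
```

With a lookback of three weeks the eight values yield five training pairs, and the persistence baseline has a mean absolute error of 4.0 on them. A learned predictor is only useful if it beats such a baseline on held-out data.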

Use Case: Segregation Simulation

  1. Goal: Study of segregation effects (cluster groups) with individual (variant) behaviour based on mobility and social networking

  2. Methodologies:

    a. Agent-based simulation with parameterised mobility and interaction models

    b. Agent-based crowdsensing that performs micro surveys via mobile devices and chat dialogues, finally creating digital twins that introduce variance into the behaviour model

Closed Simulation with static agent behaviour

  • Static segregation behaviour model
  • Group A/B cluster formation

Open Simulation with On-line Crowdsensing and Digital Twins

  • Digital Twins introduce behaviour variance on micro level
  • Global clustering outcome differs! (Bosse, Engel, SSC 2019)
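
The difference between the closed and the open setting can be sketched with a small Schelling-style grid model, which is swapped in here as a generic stand-in and is not the simulation used in the cited work. In the closed run every agent shares one tolerance threshold; in the open run every fourth agent is treated as a digital twin whose threshold is replaced by a random value that stands in for a survey answer. All names and parameter values are assumptions.

```python
import random

def neighbours(grid, size, x, y):
    """Groups of the occupied cells among the 8 surrounding cells (torus)."""
    cells = [((x + dx) % size, (y + dy) % size)
             for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return [grid[c]["group"] for c in cells if grid[c] is not None]

def similarity(grid, size, x, y):
    """Fraction of neighbours that belong to the agent's own group."""
    near = neighbours(grid, size, x, y)
    if not near:
        return 1.0
    return sum(g == grid[(x, y)]["group"] for g in near) / len(near)

def run(size, thresholds, steps, seed=0):
    """Schelling-style run; `thresholds` holds one tolerance per agent."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    rng.shuffle(cells)
    grid = {c: None for c in cells}
    for i, threshold in enumerate(thresholds):
        grid[cells[i]] = {"group": "AB"[i % 2], "threshold": threshold}
    for _ in range(steps):
        for cell in cells:
            agent = grid[cell]
            if agent is None:
                continue
            # An unhappy agent moves to a randomly chosen empty cell.
            if similarity(grid, size, *cell) < agent["threshold"]:
                empty = [c for c in cells if grid[c] is None]
                target = rng.choice(empty)
                grid[target], grid[cell] = agent, None
    occupied = [c for c in cells if grid[c] is not None]
    index = sum(similarity(grid, size, *c) for c in occupied) / len(occupied)
    return grid, index

n_agents = 80
closed = [0.5] * n_agents                    # static behaviour model
survey = random.Random(1)                    # stands in for survey answers
open_ = [survey.uniform(0.1, 0.9) if i % 4 == 0 else 0.5
         for i in range(n_agents)]           # every 4th agent is a twin

grid_closed, index_closed = run(10, closed, steps=20)
grid_open, index_open = run(10, open_, steps=20)
```

The returned index is the mean fraction of same-group neighbours and can serve as a simple global clustering measure for comparing the two runs. How far the two values differ depends on the seed and the assumed thresholds.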

Longitudinal data sampling and analysis are a challenge with respect to:

  • Bias, distortion, sampling intervals, impure variables, missing calibration
  • Dimensionality and data volume

Agent-based methods with a unified agent model feature:

  • Mobile crowdsensing that samples environmental and user data on the micro-scale level
  • Tight coupling of simulation (ABS) with the real world (human-in-the-loop)
  • Incremental data collection by software agents that synchronises the simulation with the real world
  • Simulation snapshots and forking that enable prediction of future world evolution

Any Questions?


sbosse@uni-bremen.de, www.ag-0.de

