Stefan Bosse: Self-adaptive Traffic and Logistics Flow Control using Learning Agents and Ubiquitous Sensors
Stefan Bosse
University of Bremen, Dept. Mathematics & Computer Science, Bremen, Germany
sbosse@uni-bremen.de
This work demonstrates the benefits of agent-based simulation and learning agents for the self-organised and decentralised optimisation of traffic and logistics flows in cities
This work addresses three paradigms to create smart city control:
Cooperating and interacting Multi-agent Systems;
Reinforcement Learning (RL);
Self-organisation and self-adaptivity.
Traffic and logistics flow control should be achieved on three levels:
Ensemble control by the environment
Individual control by driving entities, e.g., vehicles
Physical and computational agents are handled by one unified agent model
Physical Agents. Coupled to a mobile platform and representing physical entities (humans, vehicles, products, machines, ...)
Computational Agents. Representation and implementation of mobile software
Activity-Transition Graph (ATG) behaviour and data model of an agent for a specific class AC (left). Physical and computational agents differ in their action set (middle). Physical and computational agents can communicate with each other (right)
The proposed hybrid parameterised agent architecture combining reactive state-based action selection with RL
percept : Sen × Per → Per
next    : St × Per × Par × R × C → St
action  : St × Par → Act1
rl      : r × Per → Act2
reward  : Act2 × Per × Par → r ∈ [−1,1]
fusion  : Act1 × Act2 → Act
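The following JavaScript sketch illustrates how these functions could interact in one control cycle. The names mirror the signatures above, but the function bodies, the ε-greedy table update, and the parameter values are simplified assumptions for illustration only, not the implementation used in this work.

// Hypothetical sketch of the hybrid control loop (not the JAM API)
var params     = { maxSpeed: 2, epsilon: 0.1 };  // Par: behaviour parameters (assumed values)
var Q          = {};                             // toy action-value table for the RL part
var lastReward = 0;                              // r: last reward fed back into rl()

function percept (sensors) {                     // Sen × Per → Per
  return { speed: sensors.speed, dist: sensors.dist, heading: sensors.heading };
}
function action (state, par) {                   // St × Par → Act1 (reactive, rule-based)
  return state === 'blocked' ? { op: 'stop' } : { op: 'move', speed: par.maxSpeed };
}
function rl (r, per) {                           // r × Per → Act2 (ε-greedy over a value table)
  Q[per.heading] = Q[per.heading] || { N: 0, S: 0, W: 0, E: 0 };
  var dirs = Object.keys(Q[per.heading]);
  var best = dirs.reduce(function (a, b) { return Q[per.heading][a] >= Q[per.heading][b] ? a : b; });
  return Math.random() < params.epsilon
    ? { dir: dirs[Math.floor(Math.random() * dirs.length)] }   // explore
    : { dir: best };                                            // exploit
}
function reward (act2, per, par) {               // Act2 × Per × Par → r ∈ [−1,1]
  return per.dist.front > 1 ? 0.5 : -0.5;        // toy utility: free road ahead is good
}
function fusion (act1, act2) {                   // Act1 × Act2 → Act
  return Object.assign({}, act1, act2);          // long-range direction overlays the short-range move
}
function step (sensors, state) {                 // one control cycle of the hybrid agent
  var per = percept(sensors);
  var a1  = action(state, params);
  var a2  = rl(lastReward, per);
  lastReward = reward(a2, per, params);
  Q[per.heading][a2.dir] += 0.1 * (lastReward - Q[per.heading][a2.dir]);  // toy value update
  return fusion(a1, a2);
}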
The utility function u(S): S → r provides the necessary input for the reinforcement learning instance (reward function)
The utility function uses a set of internal and external state variables S derived from sensor input (Per):
type states S = {
  v0      : Normalised average speed,
  ds      : Distances to s ∈ {front, back, left, right} neighbour vehicles,
  de, Δde : Distance to destination and delta change (progress),
  td      : Direction to destination (0–360 degrees),
  r0      : Direction of vehicle (0–360 degrees),
  qt0     : Queuing time,
  dd      : Allowed driving and turning directions,
  P       : Set of possible paths from current position to destination
}
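As an illustration, a hedged sketch of such a utility computation in JavaScript follows. The weighting factors and the scaling of the state variables are assumptions for demonstration only, not the utility used in this work.

// Hypothetical utility u(S) → r ∈ [−1,1] derived from the state variables above
// (S.v0: normalised speed, S.dde: Δde progress, S.qt0: queuing time in simulation steps)
function utility (S) {
  var r = 0;
  r += 0.4 * S.v0;                                // reward free flow
  r += 0.4 * Math.max(-1, Math.min(1, -S.dde));   // negative Δde means progress towards the destination
  r -= 0.2 * Math.min(1, S.qt0 / 10);             // penalise long queuing (assumed scale: 10 steps)
  return Math.max(-1, Math.min(1, r));
}
// Example: fast, progressing, not queuing → reward close to +0.8
utility({ v0: 1.0, dde: -1, qt0: 0 });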
Vehicle Agent → Rule-based short-range navigation
Navigation Agent → Reinforcement Learning of long-range navigation for path length and time optimisation
Traffic Control Agent → Flow sensing and control (traffic lights and signs)
Environmental sensors used by agents and agent interaction
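As an illustration of the traffic control agent's role, a hedged sketch follows. The sensor and actuator interfaces (readDensity, setSignal) and the timing values are hypothetical placeholders and not part of the JAM platform.

// Hypothetical traffic control agent: senses local vehicle density and
// adapts the green phase of a traffic light accordingly (flow control)
function trafficControl (readDensity, setSignal) {
  var history = [];
  return function step () {
    var density = readDensity();                  // vehicles per street segment, assumed in [0,1]
    history.push(density);
    if (history.length > 10) history.shift();     // sliding window over the last 10 samples
    var avg = history.reduce(function (a, b) { return a + b; }, 0) / history.length;
    setSignal({ green: avg > 0.5 ? 30 : 15, red: avg > 0.5 ? 15 : 30 });   // longer green when congested
  };
}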
Short-range action set (rule-based vehicle control):
Act = [
  Move one step left, right, backward, or ahead (|Δ| = 1),
  Satisfy distance constraints: increase or decrease the vehicle speed,
  Follow the short-range Δ displacement vector (min Δ),
  Stop movement
]
Long-range action set (RL-based navigation):
Act = [
  Change direction to N/S/W/E,
  Keep direction (forced),
  Change vehicle speed,
  Change destination,
  Escape blocking situations
]
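A hedged JavaScript sketch of the rule-based short-range action selection is given below. The thresholds, the displacement representation (dx, dy), and the returned action objects are illustrative assumptions.

// Hypothetical rule-based short-range action selection (vehicle agent).
// delta: desired short-range displacement vector towards the next waypoint,
// dist:  measured distances to the neighbouring vehicles, speed: current speed.
function shortRange (delta, dist, speed) {
  if (dist.front === 0) return { op: 'stop' };                                // blocked: stop movement
  if (dist.front < 2 && speed > 1) return { op: 'speed', value: speed - 1 };  // keep distance constraint
  if (Math.abs(delta.dx) >= Math.abs(delta.dy))                               // follow min-Δ displacement
    return { op: 'move', dx: Math.sign(delta.dx), dy: 0 };                    // one step left/right (|Δ|=1)
  return { op: 'move', dx: 0, dy: Math.sign(delta.dy) };                      // one step ahead/backward
}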
Although the learning and adaptive long-range traffic routing can be deployed in the real world, simulation is used to investigate the training and the impact of the proposed approach
Core component: The JavaScript Agent Machine (JAM) is a portable and powerful agent processing platform written in JavaScript and capable of executing JavaScript agents
SEJAM extends the agent platform JAM with a simulation and visualisation layer
SEJAM supports the concept of closed-loop simulation for augmented virtuality
Mobile and non-mobile devices executing the JAM platform can be connected to the virtual simulation world (via the Internet)
Virtual simulation world (Simulator SEJAM) coupled to real worlds via the Internet and unified agent processing platforms (JAM)
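A minimal sketch of such a closed-loop coupling is given below, assuming a plain WebSocket transport (Node.js 'ws' package) between a physical device and the simulation server. The endpoint URL, the message format, and the stubs readSpeedSensor/applySignal are illustrative assumptions, not the JAM/SEJAM protocol.

// Hypothetical coupling of a physical device to the virtual simulation world
var WebSocket = require('ws');
var ws = new WebSocket('ws://sejam.example.org:8080');       // assumed simulation endpoint

function readSpeedSensor () { return 42; }                   // stub for a real sensor reading (km/h)
function applySignal (act)  { console.log('apply', act); }   // stub for a real actuator (e.g., traffic sign)

ws.on('open', function () {
  setInterval(function () {                                  // publish real sensor samples into the virtual world
    ws.send(JSON.stringify({ type: 'sensor', speed: readSpeedSensor() }));
  }, 1000);
});
ws.on('message', function (msg) {                            // receive control actions computed in the simulation
  var act = JSON.parse(msg);
  if (act.type === 'signal') applySignal(act);
});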
The simulation model and the agents are programmed entirely in JavaScript
It consists of agent class constructors defining the agent behaviour, visual resources, node definitions, simulation parameters, and the world set-up (see the model descriptor below)
Physical and computational agents can be modelled the same way!
// Agent Class Constructors
function world (options)     { this.XX=xx; this.act={}; this.trans={}; this.on={}; this.next=init }
function vehicle (options)   { this.XX=xx; this.act={}; this.trans={}; this.on={}; this.next=init }
function navigator (options) { this.XX=xx; this.act={}; this.trans={}; this.on={}; this.next=init }

// Simulation World Model Descriptor
model = {
  agents    : { world : { behaviour:world, visual : { .. }, .. } },
  resources : { street : { .. }, place : { .. }, signal : { .. }, .. },
  nodes     : { world : { .. }, vehicle : { .. }, navigator : { .. } },
  parameter : { .. },
  world     : { init : {..}, map : {..}, resources : {..}, patchgrid : {..} }
}
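For illustration, a hedged sketch of a filled-in agent class constructor in the ATG style is given below. The activities, transitions, and state variables are simplified assumptions, not the original vehicle agent behaviour.

// Hypothetical, simplified vehicle agent: activities (act), transitions (trans),
// event handlers (on), and the start activity (next)
function vehicleSketch (options) {
  this.pos   = options.pos;        // current position (assumed: node index on the street map)
  this.dest  = options.dest;       // destination (assumed: node index)
  this.speed = 1;
  this.act = {
    init    : function () { /* read parameters, register with a navigator agent */ },
    percept : function () { /* sample neighbourhood sensors (distances, signals) */ },
    move    : function () { /* apply the fused short-/long-range action */ },
    end     : function () { /* destination reached: terminate */ }
  };
  this.trans = {
    init    : function () { return 'percept'; },
    percept : function () { return 'move'; },
    move    : function () { return this.pos === this.dest ? 'end' : 'percept'; }
  };
  this.on   = {};                  // event handlers (e.g., signals from other agents)
  this.next = 'init';              // start activity
}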
Path efficiency: (left) without pre-trained agents, (right) with pre-trained agents
Progress of global reward without pre-trained agents
In contrast to common traffic management, which controls traffic lights and signals only, this work addressed traffic flow optimisation on the micro-level by adapting the decision-making processes of vehicles, primarily long-range navigation and re-routing, optionally combined with vehicle speed control.
Training the reinforcement-learning navigation agents takes thousands of trial-and-error cycles and a long time before the navigation strategy becomes better than a random walk; it is therefore only feasible in simulated worlds. Performed in the real world, such training would cause urban traffic to collapse.
Results from an agent-based simulation of an artificial urban area show that such micro-level vehicle control, relying only on individual decision making, learning, and re-routing based on local environmental sensors, can achieve near-optimal routing (with respect to total route length and travelling time) even under high traffic densities.
Thank you for your attention. All questions are welcome!