Markov-game modeling of cyclist-pedestrian interactions in shared spaces: a multi-agent adversarial inverse reinforcement learning approach

Alsaleh, Rushdi; Sayed, Tarek

doi:10.1016/j.trc.2021.103191

Journal Article

Markov-game modeling of cyclist-pedestrian interactions in shared spaces: a multi-agent adversarial inverse reinforcement learning approach
Citation	Alsaleh R, Sayed T. Transp. Res. C Emerg. Technol. 2021; 128: e103191.
Copyright	(Copyright © 2021, Elsevier Publishing)
DOI	10.1016/j.trc.2021.103191
PMID	unavailable
Abstract	Understanding and modeling road user dynamics and their microscopic interaction behaviour at shared space facilities are crucial for several applications, including safety and performance evaluations. Despite the multi-agent nature of road user interactions, the majority of previous studies modeled these interactions within a single-agent framework, i.e., treating the other interacting agents as part of the passive environment. However, this assumption is unrealistic and could limit the model's accuracy and transferability in non-stationary road user environments. This study proposes a novel Multi-Agent Adversarial Inverse Reinforcement Learning approach (MA-AIRL) to model and simulate road user interactions at shared space facilities. Unlike the traditional game-theoretic framework, which models multi-agent systems as a single time-step payoff, the proposed approach is based on Markov Games (MG), which model road users' sequential decisions concurrently. Moreover, the proposed model can handle boundedly rational agents, e.g., agents with limited access to information, through the implementation of the Logistic Stochastic Best Response Equilibrium (LSBRE) solution concept. The proposed algorithm recovers road users' multi-agent reward functions using adversarial deep neural network discriminators and estimates their optimal policies using Multi-agent Actor-Critic with Kronecker factors (MACK) deep reinforcement learning. Data from three shared space locations in Vancouver, BC and New York City, New York are used in this study. The model's performance is compared to a baseline single-agent Gaussian Process Inverse Reinforcement Learning (GPIRL) model. The results show that the multi-agent modeling framework led to significantly more accurate predictions of road users' behaviour and their evasive action mechanisms. Moreover, the reward functions recovered with the single-agent modeling approach failed to capture the equilibrium solution concept as the multi-agent approach did.
Language: en
Keywords	Cyclists and pedestrians; Microsimulation; Multi-agent models; Reward functions; Shared space modeling
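
The abstract names two ideas that a short sketch may make more concrete: an adversarial discriminator used to recover a reward, and a logistic (softmax) response used to represent boundedly rational choice. The Python below is an illustration only, not the authors' implementation. It assumes the standard AIRL discriminator form D = exp(f) / (exp(f) + pi(a|s)) and a single-step softmax over hypothetical action values. The function names, the temperature parameter, and the example numbers are all assumptions introduced here, and the full LSBRE concept in a Markov game couples every agent's policy over time, which this sketch does not attempt.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Probability that a (state, action) pair came from observed data.

    Uses D = exp(f) / (exp(f) + pi(a|s)), which is algebraically equal to
    sigmoid(f - log pi(a|s)). `f_value` stands in for the output of a learned
    reward network (hypothetical here); `log_pi` is the log-probability the
    current policy assigns to the action.
    """
    return 1.0 / (1.0 + math.exp(-(f_value - log_pi)))


def logistic_response(q_values, temperature: float = 1.0):
    """Softmax over action values for one agent at one decision step.

    Higher-valued actions are chosen more often but not always, which is one
    simple way to express bounded rationality. `temperature` is an assumed
    knob: larger values make the choice more random.
    """
    shift = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - shift) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical action values for a cyclist: keep course, yield, swerve.
cyclist_probs = logistic_response([1.0, 2.0, 0.5])
```

When `f_value` equals `log_pi`, the discriminator returns 0.5, meaning it cannot tell the policy's behaviour from the observed behaviour at that point.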