Trustworthy safety improvement for autonomous driving using reinforcement learning

Cao, Zhong; Xu, Shaobing; Jiao, Xinyu; Peng, Huei; Yang, Diange

doi:10.1016/j.trc.2022.103656

Journal Article

Citation	Cao Z, Xu S, Jiao X, Peng H, Yang D. Transp. Res. C Emerg. Technol. 2022; 138: e103656.
Copyright	(Copyright © 2022, Elsevier Publishing)
DOI	10.1016/j.trc.2022.103656
PMID	unavailable
Abstract	Reinforcement learning (RL) can learn from past failures and has the potential to provide self-improvement ability and higher-level intelligence. However, current RL algorithms still face reliability challenges, especially compared with the rule/model-based algorithms that are pre-engineered and human-input intensive but widely used in autonomous vehicles. To take advantage of both RL and rule-based algorithms, this work designs a decision-making framework that leverages RL and uses an existing rule-based policy as its performance lower bound. In this way, the final policy retains the potential for self-learning while guaranteeing better system performance than the integrated rule-based policy. This decision-making framework is called trustworthy improvement RL (TiRL). The basic idea is to make the RL policy iteration process synchronously estimate the value function of the given rule-based policy. The autonomous vehicle (AV) then uses the RL policy to drive only in the cases where RL has learned a better policy, i.e., one with a higher policy value. This work takes safe highway driving as the case study. The results are obtained from more than 42,000 km of driving in stochastic simulated traffic calibrated with naturalistic driving data. The TiRL planner is given two typical rule-based highway-driving policies for comparison. The results show that TiRL can outperform an arbitrary given rule-based driving policy. In summary, the proposed TiRL can leverage the learning-based method in stochastic and emergent scenarios while achieving a trustworthy safety improvement over existing rule-based policies. Language: en
Keywords	Autonomous Vehicle; Driving Safety; Reinforcement Learning
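
The switching idea described in the abstract (drive with the RL policy only where its estimated value exceeds that of the rule-based policy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name choose_policy, the margin parameter, the dictionary-based value tables, and the example states and numbers are all hypothetical, and the fallback to the rule-based policy for states without estimates is an assumption made here for illustration.

```python
from typing import Dict, Hashable

State = Hashable


def choose_policy(
    state: State,
    rl_value: Dict[State, float],
    rule_value: Dict[State, float],
    margin: float = 0.0,
) -> str:
    """Return "rl" or "rule" for the given state.

    Hypothetical helper: the RL policy is selected only when its
    estimated value is strictly higher than the estimated value of the
    rule-based policy by more than `margin`. Otherwise the rule-based
    policy is kept, so it acts as the performance lower bound.
    """
    # Assumption: if either estimate is missing, stay with the rule-based policy.
    if state not in rl_value or state not in rule_value:
        return "rule"
    if rl_value[state] > rule_value[state] + margin:
        return "rl"
    return "rule"


# Illustrative value estimates only; these numbers are not from the paper.
rl_value = {"cut_in": -0.2, "free_flow": -0.1}
rule_value = {"cut_in": -0.8, "free_flow": -0.1}

for s in ("cut_in", "free_flow", "unseen"):
    print(s, "->", choose_policy(s, rl_value, rule_value))
```

In this sketch a tie or a missing estimate keeps the rule-based policy, which reflects the abstract's framing of the rule-based policy as the lower bound; how the value estimates are learned and compared in practice is described in the paper itself, not here.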