
DeFIX

DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving

Official codebase of the paper, published at ITSC 2022.


Citation

@INPROCEEDINGS{dagdanov2022defix,
    author    = {Dagdanov, Resul and Eksen, Feyza and Durmus, Halil and Yurdakul, Ferhat and Ure, Nazim Kemal},
    booktitle = {2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
    title     = {DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving},
    year      = {2022},
    pages     = {4215-4220},
    doi       = {10.1109/ITSC55140.2022.9922209}
}

Abstract

Safely navigating through an urban environment without violating any traffic rules is a crucial performance target for reliable autonomous driving. In this paper, we present a Reinforcement Learning (RL) based methodology to DEtect and FIX (DeFIX) failures of an Imitation Learning (IL) agent by extracting infraction spots and reconstructing mini-scenarios on these infraction areas to train an RL agent that fixes the shortcomings of the IL approach. DeFIX is a continuous learning framework, where the extraction of failure scenarios and the training of RL agents are executed in an infinite loop. After each new policy is trained and added to the library of policies, a policy classifier decides which policy to activate at each step during evaluation. It is demonstrated that even with only one RL agent trained on the failure scenarios of an IL agent, the DeFIX method is either competitive with or outperforms state-of-the-art IL- and RL-based autonomous urban driving benchmarks. We trained and validated our approach on the most challenging map (Town05) of the CARLA simulator, which involves complex, realistic, and adversarial driving scenarios.


Method Overview

Stage-1: An initial dataset is generated by driving a rule-based CARLA autopilot and is used to train an IL agent. This agent is continuously improved with the DAgger approach. Stage-2: Mini-scenarios are constructed from the infraction and failure locations found by evaluating the DAgger-trained IL policy. In each loop of this stage, a different RL agent is trained on one of these mini-scenarios and added to the library of RL policies. To decide whether to activate the IL agent or one of the RL agents during evaluation, a policy classifier network is trained with supervised learning. A minimal sketch of this loop is given below.
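The following pseudocode summarizes the two-stage continual loop described above. It is an illustrative sketch only: every helper name in it (collect_autopilot_data, train_il_agent, run_dagger_iteration, extract_failure_scenarios, train_rl_agent, train_policy_classifier) is a hypothetical placeholder, not part of this repository's API.

```python
# Illustrative sketch of the DeFIX continual learning loop.
# All helper functions are hypothetical placeholders, not repository APIs.

def defix(num_loops: int):
    # Stage-1: train an initial IL agent on rule-based autopilot data.
    il_agent = train_il_agent(collect_autopilot_data())
    policy_library = [il_agent]
    classifier = None

    for _ in range(num_loops):
        # Stage-1 (continuous): improve the IL agent with DAgger aggregation.
        il_agent = run_dagger_iteration(il_agent)

        # Stage-2: evaluate the IL policy, collect infraction spots, and
        # reconstruct a mini-scenario around one of them.
        mini_scenarios = extract_failure_scenarios(il_agent)
        rl_agent = train_rl_agent(mini_scenarios[0])
        policy_library.append(rl_agent)

        # Retrain the classifier that decides which policy in the library
        # to activate at each step during evaluation.
        classifier = train_policy_classifier(policy_library)

    return policy_library, classifier
```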


Network Architectures

A pre-trained ResNet-50 model is used as the image backbone of all proposed networks. The last layer of the ResNet-50 is reinitialized and trained from scratch during the training of the brake and policy classifiers. Speed, orientation, location, and front-camera data are obtained from sensors, while the sequence of target locations is provided in advance. Target locations are converted to the vehicle's local coordinate frame using the position and orientation obtained from the GPS and IMU sensors, respectively. In the supervised learning networks, a 704-dimensional feature vector represents the fused sensor inputs and is processed through fully connected layers. The IL agent only decides when to apply a brake action, while the policy classifier decides which trained agent to activate during evaluation. The ResNet-50 model is completely frozen during RL training, and the state space of the RL agent is the 1000-dimensional vector output by the ResNet backbone. RL agents output high-level action commands (lane keeping, right and left lane changes, stopping); low-level steering and throttle actions are computed by lateral and longitudinal PID controllers, respectively. A sketch of these components follows.
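As a concrete illustration, the PyTorch sketch below covers the coordinate transform and the classifier head. The 704-dimensional fused feature and the 1000-dimensional frozen-backbone state come from the description above; the 640/64 split between image and measurement features, the hidden sizes, and all names are assumptions, not the repository's exact architecture.

```python
import math
import torch
import torch.nn as nn
from torchvision.models import resnet50


def to_local_frame(target_xy, ego_xy, ego_yaw):
    """Rotate a global target location into the ego vehicle's frame.

    ego_xy comes from the GPS sensor, ego_yaw (radians) from the IMU.
    """
    dx, dy = target_xy[0] - ego_xy[0], target_xy[1] - ego_xy[1]
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    # Rotation by -yaw: local x points along the vehicle's heading.
    return (c * dx + s * dy, -s * dx + c * dy)


class ClassifierHead(nn.Module):
    """Brake / policy classifier sketch with a ResNet-50 image backbone.

    The 704-dim fusion is from the paper; the 640/64 feature split and
    the hidden sizes are assumptions.
    """

    def __init__(self, num_classes: int, measurement_dim: int = 8):
        super().__init__()
        self.backbone = resnet50(weights="IMAGENET1K_V1")
        # Reinitialize the last layer; it is trained from scratch.
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 640)
        self.measurement_encoder = nn.Sequential(
            nn.Linear(measurement_dim, 64), nn.ReLU(),
        )
        self.head = nn.Sequential(  # fully connected layers on the fusion
            nn.Linear(704, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, image, measurements):
        fused = torch.cat(
            [self.backbone(image), self.measurement_encoder(measurements)],
            dim=1,
        )
        return self.head(fused)


# For the RL agents the backbone is frozen, and its standard 1000-dim
# ImageNet output is used directly as the state vector.
rl_backbone = resnet50(weights="IMAGENET1K_V1").eval()
for p in rl_backbone.parameters():
    p.requires_grad = False
```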


List of CARLA Scenarios

| Scenario ID | Scenario Name |
| --- | --- |
| 0 | Dynamic Vehicle Collision |
| 1 | Emerging Pedestrian Collision |
| 2 | Stuck Vehicle & Static Objects |
| 3 | Vehicle Running Red Light |
| 4 | Crossing Signalized Traffic Intersections |
| 5 | Crossing Un-signalized Intersections |

Stuck Vehicle Scenarios in Town05

Illustration of stuck vehicle scenarios in Town05. The blue-boxed scene is the training scenario of the RL policy, which aims to optimize the overtaking decision. The orange-boxed scenes are examples of evaluation scenarios.


Benchmark Comparison of Driving Performance Scores in Town05

| Method | RC - Short | DS - Short | RC - Long | DS - Long |
| --- | --- | --- | --- | --- |
| LBC (reference) | 55.01 | 30.97 | 32.09 | 7.05 |
| Late Fusion (reference) | 83.66 | 51.56 | 68.05 | 31.30 |
| CILRS (reference) | 13.40 | 7.47 | 7.19 | 3.68 |
| AIM (reference) | 81.07 | 49.00 | 60.66 | 26.50 |
| TransFuser (reference) | 78.41 | 54.52 | 56.36 | 33.15 |
| NEAT (reference) | 69.34 | 58.21 | 88.78 | 57.49 |
| Geometric Fusion (reference) | 86.91 | 54.32 | 69.17 | 25.30 |
| World on Rails (reference) | 52.60 | 38.14 | 60.57 | 32.18 |
| DeFIX (ours) | 96.34 | 72.41 | 89.61 | 39.42 |
| IL Agent (ours) | 90.10 | 68.47 | 75.40 | 33.17 |
| RL Agent (ours) | 30.14 | 24.65 | 5.17 | 4.15 |
| Autopilot Agent (reference) | 90.94 | 82.75 | 75.41 | 48.60 |
  • RC - Short : Route Completion Score of an Agent in Town05-Short
  • DS - Short : Driving Score of an Agent in Town05-Short
  • RC - Long : Route Completion Score of an Agent in Town05-Long
  • DS - Long : Driving Score of an Agent in Town05-Long
