
DeFIX

DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving

Official codebase of the paper, published at ITSC 2022.


Citation

@INPROCEEDINGS{dagdanov2022defix,
    author    = {Dagdanov, Resul and Eksen, Feyza and Durmus, Halil and Yurdakul, Ferhat and Ure, Nazim Kemal},
    booktitle = {2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
    title     = {DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving},
    year      = {2022},
    pages     = {4215-4220},
    doi       = {10.1109/ITSC55140.2022.9922209}
}

Abstract

Safely navigating through an urban environment without violating any traffic rules is a crucial performance target for reliable autonomous driving. In this paper, we present a Reinforcement Learning (RL) based methodology to DEtect and FIX (DeFIX) failures of an Imitation Learning (IL) agent by extracting infraction spots and reconstructing mini-scenarios on these infraction areas to train an RL agent that fixes the shortcomings of the IL approach. DeFIX is a continuous learning framework, where the extraction of failure scenarios and the training of RL agents are executed in an infinite loop. After each new policy is trained and added to the library of policies, a policy classifier decides which policy to activate at each step during evaluation. It is demonstrated that even with only one RL agent trained on the failure scenarios of an IL agent, the DeFIX method is either competitive with or outperforms state-of-the-art IL- and RL-based autonomous urban driving benchmarks. We trained and validated our approach on the most challenging map (Town05) of the CARLA simulator, which involves complex, realistic, and adversarial driving scenarios.


Method Overview

Stage-1: An initial dataset is generated by driving a rule-based CARLA autopilot and is used to train an IL agent. This agent is continuously improved with the DAgger approach. Stage-2: Mini-scenarios are constructed from the infraction and failure locations found by evaluating the DAgger-trained IL policy. In each loop of this stage, a different RL agent is trained on one of these mini-scenarios and added to the library of RL policies. To decide whether to activate the IL agent or one of the RL agents during evaluation, a policy classifier network is trained with supervised learning. A minimal sketch of this loop is given below.
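The following pseudocode summarizes the two-stage continual loop described above. It is an illustrative sketch only: every helper name in it (collect_autopilot_data, train_il_agent, run_dagger_iteration, extract_failure_scenarios, train_rl_agent, train_policy_classifier) is a hypothetical placeholder, not part of this repository's API.

```python
# Illustrative sketch of the DeFIX continual learning loop.
# All helper functions are hypothetical placeholders, not repository APIs.

def defix(num_loops: int):
    # Stage-1: train an initial IL agent on rule-based autopilot data.
    il_agent = train_il_agent(collect_autopilot_data())
    policy_library = [il_agent]
    classifier = None

    for _ in range(num_loops):
        # Stage-1 (continuous): improve the IL agent with DAgger aggregation.
        il_agent = run_dagger_iteration(il_agent)

        # Stage-2: evaluate the IL policy, collect infraction spots, and
        # reconstruct a mini-scenario around one of them.
        mini_scenarios = extract_failure_scenarios(il_agent)
        rl_agent = train_rl_agent(mini_scenarios[0])
        policy_library.append(rl_agent)

        # Retrain the classifier that decides which policy in the library
        # to activate at each step during evaluation.
        classifier = train_policy_classifier(policy_library)

    return policy_library, classifier
```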


Network Architectures

A pre-trained ResNet-50 model is used as the image backbone of all proposed networks. The last layer of the ResNet-50 is reinitialized and trained from scratch during the training of the brake and policy classifiers. Speed, orientation, location, and front-camera data are obtained from sensors, while the sequence of target locations is provided in advance. Target locations are converted to the vehicle's local coordinate frame using the position and orientation obtained from the GPS and IMU sensors, respectively. In the supervised learning networks, a 704-dimensional feature vector represents the fused sensor inputs and is processed through fully connected layers. The IL agent only decides when to apply a brake action, while the policy classifier decides which trained agent to activate during evaluation. The ResNet-50 model is completely frozen during RL training, and the state space of the RL agent is the 1000-dimensional vector output by the ResNet backbone. RL agents output high-level action commands (lane keeping, right and left lane changes, stopping); low-level steering and throttle actions are computed by lateral and longitudinal PID controllers, respectively. A sketch of these components follows.
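As a concrete illustration, the PyTorch sketch below covers the coordinate transform and the classifier head. The 704-dimensional fused feature and the 1000-dimensional frozen-backbone state come from the description above; the 640/64 split between image and measurement features, the hidden sizes, and all names are assumptions, not the repository's exact architecture.

```python
import math
import torch
import torch.nn as nn
from torchvision.models import resnet50


def to_local_frame(target_xy, ego_xy, ego_yaw):
    """Rotate a global target location into the ego vehicle's frame.

    ego_xy comes from the GPS sensor, ego_yaw (radians) from the IMU.
    """
    dx, dy = target_xy[0] - ego_xy[0], target_xy[1] - ego_xy[1]
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    # Rotation by -yaw: local x points along the vehicle's heading.
    return (c * dx + s * dy, -s * dx + c * dy)


class ClassifierHead(nn.Module):
    """Brake / policy classifier sketch with a ResNet-50 image backbone.

    The 704-dim fusion is from the paper; the 640/64 feature split and
    the hidden sizes are assumptions.
    """

    def __init__(self, num_classes: int, measurement_dim: int = 8):
        super().__init__()
        self.backbone = resnet50(weights="IMAGENET1K_V1")
        # Reinitialize the last layer; it is trained from scratch.
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 640)
        self.measurement_encoder = nn.Sequential(
            nn.Linear(measurement_dim, 64), nn.ReLU(),
        )
        self.head = nn.Sequential(  # fully connected layers on the fusion
            nn.Linear(704, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, image, measurements):
        fused = torch.cat(
            [self.backbone(image), self.measurement_encoder(measurements)],
            dim=1,
        )
        return self.head(fused)


# For the RL agents the backbone is frozen, and its standard 1000-dim
# ImageNet output is used directly as the state vector.
rl_backbone = resnet50(weights="IMAGENET1K_V1").eval()
for p in rl_backbone.parameters():
    p.requires_grad = False
```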


List of CARLA Scenarios

| Scenario ID | Scenario Name |
| --- | --- |
| 0 | Dynamic Vehicle Collision |
| 1 | Emerging Pedestrian Collision |
| 2 | Stuck Vehicle & Static Objects |
| 3 | Vehicle Running Red Light |
| 4 | Crossing Signalized Traffic Intersections |
| 5 | Crossing Un-signalized Intersections |

Stuck Vehicle Scenarios in Town05

Illustration of stuck vehicle scenarios in Town05. The blue-boxed scene is the training scenario of the RL policy, which aims to optimize the overtaking decision. The orange-boxed scenes are examples of evaluation scenarios.


Benchmark Comparison of Driving Performance Scores in Town05

| Method | RC - Short | DS - Short | RC - Long | DS - Long |
| --- | --- | --- | --- | --- |
| LBC (reference) | 55.01 | 30.97 | 32.09 | 7.05 |
| Late Fusion (reference) | 83.66 | 51.56 | 68.05 | 31.30 |
| CILRS (reference) | 13.40 | 7.47 | 7.19 | 3.68 |
| AIM (reference) | 81.07 | 49.00 | 60.66 | 26.50 |
| TransFuser (reference) | 78.41 | 54.52 | 56.36 | 33.15 |
| NEAT (reference) | 69.34 | 58.21 | 88.78 | 57.49 |
| Geometric Fusion (reference) | 86.91 | 54.32 | 69.17 | 25.30 |
| World on Rails (reference) | 52.60 | 38.14 | 60.57 | 32.18 |
| DeFIX (ours) | 96.34 | 72.41 | 89.61 | 39.42 |
| IL Agent (ours) | 90.10 | 68.47 | 75.40 | 33.17 |
| RL Agent (ours) | 30.14 | 24.65 | 5.17 | 4.15 |
| Autopilot Agent (reference) | 90.94 | 82.75 | 75.41 | 48.60 |
  • RC - Short : Route Completion Score of an Agent in Town05-Short
  • DS - Short : Driving Score of an Agent in Town05-Short
  • RC - Long : Route Completion Score of an Agent in Town05-Long
  • DS - Long : Driving Score of an Agent in Town05-Long
