Self-Driving Car in Duckietown Simulation Using Imitation Learning

May 6, 2023 · Aishik Pyne · 5 min

Designing a fully autonomous driving algorithm using Imitation Learning, A* Search, and MPC in the Duckietown simulation environment.

This project was part of my coursework at NUS (CS5478: Intelligent Robots, Spring 2023), taught by Prof. David Hsu. A PDF version of the project report can be found here.

Problem Statement #

Setup: The Duckietown environment is a driving simulator. At a high level, it is a grid world where every tile is either drivable or not drivable; this can be thought of as a satellite map. At a lower level, each drivable tile is either a two-way road, a left or right turn, or a 3-way or 4-way junction; this can be thought of as what a driver sees through the windshield while driving.

The world follows right-hand traffic, but there are no other cars on the road and no traffic signs to obey.
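For intuition, the high-level map can be represented as a set of drivable tiles. Here is a tiny illustrative encoding (my own, not the simulator's native map format), which the planner sketch later in this post also assumes:

```python
# A tiny illustrative map: '.' is drivable, '#' is not.
ASCII_MAP = [
    "####",
    "#..#",
    "#..#",
    "####",
]

# Set of (x, y) tiles the car may drive on.
drivable = {
    (x, y)
    for y, row in enumerate(ASCII_MAP)
    for x, ch in enumerate(row)
    if ch == "."
}
```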

Objective: Given GPS coordinates on a high-level map and a front-facing camera feed, the goal is to design a self-driving algorithm that can navigate from any starting point to a specified goal tile. Additionally, the algorithm must not deviate from the center of the right lane and must reach the goal in the shortest time possible.

Figure 1: Front-facing camera feed: an example of navigating on the road.

Figure 2: High-level map: an example task. White squares are drivable; black squares are not. The blue lines mark the lanes inside the drivable tiles. (1) The blue triangle is the start tile. (2) The red cross is the goal tile. (3) The small squares form a path from the start to the goal; the color of each square indicates whether the car went forward or took a left or right turn there.

Algorithm #

I sectioned the problem off into components. First I describe the components briefly (a control-flow sketch follows the list), and later I go into each component in depth.


  • Alignment Controller: When the episode starts, we know the X, Y coordinates of the current tile, but the orientation of the car is unknown. This is a challenge because, later in the pipeline, high-level intentions like left, right, and forward guide the low-level controller, and these intentions are relative to the car's current orientation, so the orientation must be estimated at run time. Moreover, the car might spawn at any random orientation, so the best that can be done is to use camera observations to align the robot to the center of the right lane, parallel to the edges of the road, and then exit the first tile. This is the job of the aligner. Once the car moves to the next tile, we get a new GPS observation, which can be used to recover the orientation of the car exactly. Control is then passed to the high-level planner.
  • High Level Planner: In a 2D grid world, the car has 3 degrees of freedom (DOF): X, Y, and orientation $\theta$. Given the starting state and the goal tile coordinates, the A* algorithm can search for a high-level path and translate it into intentions, e.g. L, L, F, F, F, R, L, R, L. This high-level plan is used by the intention-conditioned lane follower to reach the goal tile.
  • Lane Follower: The job of this component is to drive as fast as possible along the center of the right lane, except at 3-way and 4-way junctions. At a junction, it must use the high-level intention to disambiguate which lane to follow and take the turn towards that lane. This continues until the goal tile is reached.
  • Localization Module: Once the goal tile is reached, the car must recognize a building with a duck on top and move towards it. The car performs tank turns until the features of the current observation match the learned features of the building.
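To make the hand-off between these components concrete, here is a minimal sketch of the high-level state machine; the `Phase` names and the `step_phase` helper are illustrative placeholders, not the project's actual code, and the two booleans are assumed to be derived from the GPS observation:

```python
from enum import Enum, auto

class Phase(Enum):
    ALIGN = auto()     # orientation unknown: merge into the right lane
    NAVIGATE = auto()  # follow the A* intentions tile by tile
    LOCALIZE = auto()  # on the goal tile: rotate to find the duck building

def step_phase(phase: Phase, left_first_tile: bool, on_goal_tile: bool) -> Phase:
    """Advance the high-level state machine by one observation.

    Leaving the first tile yields a second GPS fix, which pins down the
    car's heading and lets the A* planner produce the intention sequence;
    reaching the goal tile hands control to the localization module.
    """
    if phase is Phase.ALIGN and left_first_tile:
        return Phase.NAVIGATE
    if phase is Phase.NAVIGATE and on_goal_tile:
        return Phase.LOCALIZE
    return phase
```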

Aligner #

Analysis of the Start Tile #

After multiple manual simulations, we made the following empirical observations about the agent's starting position.

  • The agent can start in either the right lane or the left lane, and it may or may not be facing along a lane. In extreme cases, the camera view contains no lane cues at all (no yellow center line, white edge lines, or red stop lines), for example when facing grass or a junction.
  • Driving in the left lane incurs a huge penalty, so the objective is to move over to the right lane as quickly as possible without going off the road.

Alignment Controller #

Since the objective is simply to merge into the right lane, no intention is needed. We use a mix of the basic lane follower provided with the project and our trained lane follower (described later). For the first few timesteps ($t \le 0.05$), we use the provided lane follower, because it performs tank turns when the car is too close to the road edge or facing the grass. After that, our trained lane follower is capable of recovering into the lane.
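A rough sketch of this switching rule follows; the `basic_follower` and `trained_follower` callables stand in for the two controllers and are my assumptions, not the project's actual API:

```python
def alignment_action(obs, t, basic_follower, trained_follower, cutoff=0.05):
    """Choose the controller during the alignment phase.

    Both followers map a camera frame to a (velocity, steering) action.
    For t <= cutoff, the provided basic lane follower is used, since its
    tank turns recover the car when it faces the grass or the road edge;
    after that, the trained lane follower steers back into the lane.
    """
    follower = basic_follower if t <= cutoff else trained_follower
    return follower(obs)
```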

High Level Planner #

The High-Level Planner performs global planning to find the shortest path from the third tile to the goal tile using the A* algorithm. The Manhattan distance is used as the heuristic, and the step cost for moving to each neighbour is 1. Note that multiple shortest paths can exist for a particular milestone because of loops in the map.

However, we can play to the lane-following controller's strengths when choosing among the set of shortest paths. We observed that the lane-following controller moves faster and collects more reward when going forward than when turning, and that left turns consistently incur less penalty than right turns. Thus, while expanding a node, we add an additional cost based on the *intention* used to expand it: 1 for Forward, 2 for Left, and 4 for Right.
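Putting the Manhattan heuristic and the intention-weighted step costs together, a minimal planner sketch might look like this (the grid encoding, the N/E/S/W heading convention, and the function names are my assumptions, not the project's exact implementation):

```python
import heapq

# Step cost per intention: forward is cheapest, left beats right,
# mirroring the lane follower's observed strengths.
INTENTION_COST = {"F": 1, "L": 2, "R": 4}
TURN = {"F": 0, "L": -1, "R": 1}                       # heading change per intention
MOVE = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}  # (dx, dy) for N, E, S, W

def plan(drivable, start, heading, goal):
    """A* over (x, y, heading) states with a Manhattan-distance heuristic.

    `drivable` is a set of (x, y) tiles; returns an intention string such
    as "LFFR", or None if the goal tile is unreachable.
    """
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, heading, "")]
    best = {}  # cheapest cost-to-come seen per (tile, heading) state
    while frontier:
        _, g, pos, hd, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if best.get((pos, hd), float("inf")) <= g:
            continue
        best[(pos, hd)] = g
        for intent, cost in INTENTION_COST.items():
            nhd = (hd + TURN[intent]) % 4
            dx, dy = MOVE[nhd]
            npos = (pos[0] + dx, pos[1] + dy)
            if npos in drivable:
                heapq.heappush(
                    frontier,
                    (g + cost + h(npos), g + cost, npos, nhd, path + intent),
                )
    return None
```

With these weights, the planner will happily take a slightly longer route of forward moves and left turns instead of a path containing a right turn, which is exactly the bias we want.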

Lane Follower #

Localization Module #

Appendix #

Contributions to the project #

  • Lane Follower: Aishik Pyne
  • Aligner: Aishik Pyne
  • High Level Planner: Aishik Pyne, Niharika Shrivastava
  • Localization Module: Niharika Shrivastava
  • Report Writing: Aishik Pyne, Niharika Shrivastava