Abstract
In this work, we propose adaptive driving behaviors for simulated cars using continuous-control deep reinforcement learning. Deep Deterministic Policy Gradient (DDPG) is known to produce smooth driving maneuvers in simulated environments. However, simple feedforward networks lack the capacity to capture temporal information, so we use its recurrent variant, Recurrent Deterministic Policy Gradients (RDPG). Our trained agent adapts to the velocity of the surrounding traffic: it slows down in dense traffic to prevent collisions, and it speeds up and changes lanes to overtake when the traffic is sparse. Our main contributions, which enable the above behavior, are: 1. Application of Recurrent Deterministic Policy Gradients. 2. A novel reward function formulation. 3. A modified replay buffer, called Near and Far Replay Buffers, wherein we maintain two replay buffers and sample equally from both of them.
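The Near and Far Replay Buffer scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the routing criterion (here a caller-supplied flag), the buffer capacity, and the exact half-and-half batch split are assumptions made for the example.

```python
import random
from collections import deque

class NearFarReplayBuffer:
    """Sketch of a two-buffer replay scheme: each transition is routed
    to a 'near' or 'far' buffer, and minibatches draw equally from both."""

    def __init__(self, capacity=10000):
        # Separate fixed-capacity buffers; capacity is an assumed value.
        self.near = deque(maxlen=capacity)
        self.far = deque(maxlen=capacity)

    def add(self, transition, is_near):
        # is_near is a hypothetical caller-supplied flag; the paper's
        # exact routing criterion is not specified in this abstract.
        (self.near if is_near else self.far).append(transition)

    def sample(self, batch_size):
        # Draw half the minibatch from each buffer, sampling equally.
        half = batch_size // 2
        return (random.sample(self.near, min(half, len(self.near))) +
                random.sample(self.far, min(half, len(self.far))))

# Usage: fill both buffers with dummy transitions, then sample a batch.
buf = NearFarReplayBuffer()
for i in range(100):
    buf.add((i, "state", "action", 0.0), is_near=(i % 2 == 0))
batch = buf.sample(8)  # 4 transitions from each buffer
```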