Deep Neural Networks for Road Scene Perception in Autonomous Vehicles Using LiDARs and Vision Sensors
Accurate object detection in road scenes is one of the most essential requirements of autonomous vehicles. Based on our findings, existing solutions for autonomous vehicles rely on expensive 64-beam three-dimensional LiDAR (Light Detection and Ranging) point clouds to position objects in the real world, which greatly raises the cost of autonomous vehicles and poses the biggest barrier to their adoption. Considering the limitations of existing solutions, this thesis aims to provide accurate three-dimensional object detection using sparse LiDAR point clouds supported by camera images. A unified detection framework is proposed for road scene perception, covering cars, pedestrians, and cyclists, to optimize the accuracy of two-dimensional object detection with the existing hardware and datasets. For the final three-dimensional object detection, however, this research is narrowed down to vehicle detection only, due to the time and resource constraints of academic research. The algorithm does not rely on dense point clouds; rather, it is based on the fact that the three-dimensional center of a vehicle is its two-dimensional center translated into world coordinates. The model gives satisfactory performance on point clouds of 64-, 32-, and 16-beam density.
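The geometric idea behind the center translation can be illustrated with a minimal sketch. This is not the thesis's network or pipeline. It assumes a simple pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, LiDAR points that have already been projected into the image as `(u, v, depth)` triples, and a depth estimate taken as the median depth of the points falling inside the two-dimensional box. All function names are illustrative.

```python
from statistics import median


def backproject_center(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) at a given depth to camera-frame (X, Y, Z).

    Assumes an ideal pinhole camera without lens distortion.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def box_depth(box, lidar_pixels):
    """Median depth of the projected LiDAR points inside a 2D box.

    box: (u_min, v_min, u_max, v_max) in pixels.
    lidar_pixels: iterable of (u, v, depth) triples (assumed pre-projected).
    Returns None when no point falls inside the box.
    """
    u_min, v_min, u_max, v_max = box
    depths = [
        d for (u, v, d) in lidar_pixels
        if u_min <= u <= u_max and v_min <= v <= v_max
    ]
    return median(depths) if depths else None


def center_3d_from_box(box, lidar_pixels, fx, fy, cx, cy):
    """Estimate a 3D point for a 2D detection box from sparse depth samples."""
    depth = box_depth(box, lidar_pixels)
    if depth is None:
        return None
    u_c = (box[0] + box[2]) / 2  # 2D box center, horizontal
    v_c = (box[1] + box[3]) / 2  # 2D box center, vertical
    return backproject_center(u_c, v_c, depth, fx, fy, cx, cy)
```

Because only a handful of depth samples inside the box are needed, a sketch like this degrades gracefully as the beam count drops. Two simplifications should be kept in mind: the result is expressed in the camera frame, so mapping it to world or LiDAR coordinates would additionally require the sensor extrinsics, and the median depth approximates the visible surface of the vehicle rather than its true center.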