Vehicle-Related Scene Understanding Using Deep Learning

Liu, Xiaoxu
Yan, Wei Qi
Nguyen, Minh
Kasabov, Nikola
Degree name
Doctor of Philosophy
Auckland University of Technology

Given the diverse and intricate nature of traffic scenes, it is imperative to understand a scene from multiple perspectives and dimensions. In scenes that involve hierarchical relationships and demand a comprehensive grasp of global context, the evaluation of deep learning models hinges on their capacity for high-level semantic representation and processing. Models that better capture hierarchical relationships and excel at both global and local feature extraction are widely regarded as ideal choices for addressing the challenges of traffic scene understanding.

In this thesis, we undertake a comprehensive exploration of vehicle-related scene understanding using deep learning from multiple perspectives. Initially, we examine semantic segmentation and vehicle tracking from a 2D viewpoint. Subsequently, we extend this analysis from 2D to 3D, estimating scene depth and inter-vehicle distances from 2D images to understand the scene from a different perspective.

To enhance scene analysis, we investigate the fusion of pose and appearance as features. Additionally, we make efforts to improve the understanding of local and global features within the models. This involves restructuring the models by incorporating attention modules and a Transformer, as well as replacing tracking algorithms and adding a distance estimation vector.

Furthermore, this thesis integrates four distinct tasks: scene segmentation, vehicle tracking, distance estimation, and depth estimation. These integrated approaches yield a more sophisticated and specific scene understanding, encompassing not only a horizontal analysis from a 2D perspective but also a vertical understanding from a 3D perspective.

Traffic scene understanding, deep learning, scene segmentation, vehicle tracking, distance estimation, depth estimation, attention module, Transformer