Improved stixels towards efficient traffic-scene representations
Stixels are medium-level data representations used for the development of computer vision modules for self-driving cars. A stixel is a column of stacked space cubes ranging from the road surface to the visual top of an obstacle. A stixel represents object height at a distance, and thus supports the detection and recognition of objects regardless of their specific appearance. Stixel calculations are commonly based on binocular vision; these calculations map millions of pixel disparities into a few hundred stixels. Depending on the stereo-vision technique applied, this binocular approach sometimes fails to cope with low-textured road regions or noisy data. The main objective of this work is to evaluate and propose approaches for calculating stixels using different camera configurations and, possibly, also a LiDAR range sensor. This study also highlights the role of ground manifold modelling for stixel calculations. When simplifying ground manifold models are used, calculated stixels may suffer from noise, inconsistency, and increased false-detection rates for obstacles, especially on challenging datasets. Stixel calculations can be improved with respect to accuracy and robustness by using more adaptive ground manifold approximations. A comparative study of stixel results, obtained for different ground-manifold models, is also a main contribution of this thesis. We also consider multi-layer stixel calculations. Comprehensive experiments are performed on two publicly available challenging datasets. We also use a novel way of comparing calculated stixels with ground truth: we compare depth information, as given by extracted stixels, with ground-truth depth provided by a highly accurate LiDAR range sensor (as available in one of the public datasets). Experimental results also include quantitative evaluations of the trade-off between accuracy and run time. The results show significant improvements for particular ways of calculating stixels.
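As a minimal illustration of the data representation described above, the sketch below models a single-layer stixel as one image column spanning an obstacle, with depth recovered from disparity via the standard stereo relation Z = f·b/d, and compares it against a LiDAR ground-truth depth. All numerical values (focal length, baseline, the sample stixel, and the LiDAR depth) are made-up assumptions for illustration and do not come from the thesis.

```python
from dataclasses import dataclass

# Hypothetical camera parameters, for illustration only.
FOCAL_LENGTH_PX = 700.0   # focal length in pixels (assumed)
BASELINE_M = 0.54         # stereo baseline in metres (assumed)

@dataclass
class Stixel:
    """A single-layer stixel: one image column spanning one obstacle."""
    column: int       # image column index
    bottom_row: int   # row where the obstacle meets the ground manifold
    top_row: int      # row of the visual top of the obstacle
    disparity: float  # representative disparity of the obstacle (pixels)

    def depth(self) -> float:
        """Depth from disparity via the stereo relation Z = f * b / d."""
        return FOCAL_LENGTH_PX * BASELINE_M / self.disparity

def depth_error(stixel: Stixel, lidar_depth_m: float) -> float:
    """Absolute difference between stixel depth and a LiDAR reference depth."""
    return abs(stixel.depth() - lidar_depth_m)

# Example: one stixel and one (made-up) LiDAR reference measurement.
s = Stixel(column=320, bottom_row=400, top_row=250, disparity=18.9)
print(round(s.depth(), 2))             # → 20.0 (metres)
print(round(depth_error(s, 20.1), 2))  # → 0.1 (metres)
```

Real stixel evaluations aggregate such per-stixel depth errors over all columns and frames, which is the spirit of the LiDAR-based comparison mentioned in the abstract.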