SARM: Synthetic Data Annotation for Enhancing the Experiences of Augmented Reality Application Based on Machine Learning
Background and Objective: Augmented Reality is one of the fastest-growing fields and has attracted increasing funding over the last few years, as people realise the potential benefits of rendering virtual information in the real world. As the equipment becomes more commercialised, its cost falls while its performance rises. However, most of today’s marker-based Augmented Reality applications rely on local feature detection and tracking techniques. The disadvantage of these techniques is that the markers must either be modified to suit the specific detection algorithm or suffer from lower detection accuracy. Machine learning is well suited to overcoming these drawbacks of image processing in Augmented Reality applications.
Methods: This thesis follows two lines of investigation. The first implements new Augmented Reality markers that carry concealed information, such as a barcode or quick response (QR) code, while keeping most of the visual information of the original texture. The second demonstrates Augmented Reality markers that require neither embedded codes nor modification of the original texture, by integrating machine learning into the marker detection process. This new approach uses deep neural networks to detect and track the marker targets of the Augmented Reality application. The research implemented a tool that automatically generates datasets for the machine learning dataset preparation step. A final iOS prototype application was developed to combine object detection, object tracking and Augmented Reality. The machine learning model was trained to distinguish between targets using YOLO, one of the best-known object detection methods. The model was trained with PyTorch, and the final product uses ARKit, a toolkit for developing Augmented Reality applications.
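As a rough illustration of how an auto-generated dataset can be annotated without manual labelling, the sketch below picks a random position for a marker on a background image and writes the matching label in the YOLO text format (class index, then box centre, width and height, all normalised to the image size). It is a minimal sketch and not the tool developed in the thesis: the function names, the image sizes and the random-placement strategy are assumptions made for illustration, and the actual image compositing is left out.

```python
import random


def yolo_label(class_id, x, y, w, h, img_w, img_h):
    """Format a pixel box (top-left x, y and size w, h) as a YOLO label line."""
    cx = (x + w / 2) / img_w  # normalised box centre, x
    cy = (y + h / 2) / img_h  # normalised box centre, y
    return f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"


def place_marker(marker_w, marker_h, img_w, img_h, rng):
    """Pick a random top-left corner that keeps the marker fully inside the image."""
    x = rng.randint(0, img_w - marker_w)
    y = rng.randint(0, img_h - marker_h)
    return x, y


def synth_annotation(class_id, marker_size, bg_size, rng):
    """Return (position, label) for one synthetic sample (hypothetical helper)."""
    marker_w, marker_h = marker_size
    img_w, img_h = bg_size
    x, y = place_marker(marker_w, marker_h, img_w, img_h, rng)
    return (x, y), yolo_label(class_id, x, y, marker_w, marker_h, img_w, img_h)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the generated dataset is reproducible
    for _ in range(3):
        position, label = synth_annotation(0, (128, 96), (640, 480), rng)
        print(position, label)
```

Because the generator chooses the marker position itself, the bounding box is known exactly, so each label is correct by construction.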
Results: Several experiments were conducted to evaluate the technical performance of the proposed methods. The experimental outcomes indicate that the object detection model can achieve over 80% precision, over 90% recall, and over 70% mean average precision using the proposed synthetic datasets. The proposed method significantly improves object detection accuracy, achieving results at least 18% higher than those obtained with the real-world dataset. The iOS prototype can detect the target markers and display the augmented objects under different lighting conditions at an average rate of 50 frames per second.
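For readers less familiar with the reported metrics, the sketch below shows how precision and recall are commonly derived for object detection: a predicted box is matched to a ground-truth box by intersection over union (IoU), and the counts of true positives, false positives and false negatives give the two ratios. This is a generic illustration under assumed conventions, not the evaluation code used in the thesis; the `(x1, y1, x2, y2)` box layout and the function names are assumptions, and mean average precision is not computed here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero if the boxes do not overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A detection usually counts as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold; 0.5 is a common choice, although the threshold used in the thesis is not stated in this abstract.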