Huan, Yan; Yan, Wei Qi

2025-01-29; 2025-01-12

Electronics, ISSN: 2079-9292 (Online), MDPI AG, 14(2), 286-286. doi: 10.3390/electronics14020286

http://hdl.handle.net/10292/18531

This study explored the application of deep learning models for signal flag recognition, comparing YOLO11 with basic CNN, ResNet18, and DenseNet121. Experimental results demonstrated that YOLO11 outperformed the other models, achieving superior performance across all common evaluation metrics. The confusion matrix further confirmed that YOLO11 exhibited the highest classification accuracy among the tested models. Moreover, by integrating MediaPipe’s human posture data with image data to create multimodal inputs for training, it was observed that the posture data significantly enhanced the model’s performance. Leveraging MediaPipe’s posture data for annotation generation and model training enabled YOLO11 to achieve an impressive 99% accuracy on the test set. This study highlights the effectiveness of YOLO11 for flag signal recognition tasks. Furthermore, it demonstrates that when handling tasks involving human posture, MediaPipe not only enhances model performance through posture feature data but also facilitates data processing and contributes to validating prediction results in subsequent stages.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

0906 Electrical and Electronic Engineering; 4009 Electronics, sensors and digital hardware

Semaphore Recognition Using Deep Learning

Journal Article

OpenAccess