ChatPPG: Multi-Modal Alignment of Large Language Models for Time-Series Forecasting in Table Tennis
| aut.embargo | No | |
| aut.thirdpc.contains | No | |
| dc.contributor.advisor | Yan, Wei Qi | |
| dc.contributor.advisor | Nguyen, Minh | |
| dc.contributor.author | Yang, GuangLiang | |
| dc.date.accessioned | 2025-05-20T20:22:34Z | |
| dc.date.available | 2025-05-20T20:22:34Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | In this thesis, we explore the adaptation of large language models (LLMs) for structured time-series forecasting, focusing on predicting table tennis serve landing points. Traditional time-series models rely on specialized architectures, whereas LLMs are designed for processing text, which makes numerical sequence modeling challenging. To address this, we introduce ChatPPG, a multi-modal framework that integrates time-series data into LLMs through structured embeddings, cross-modal attention, and parameter-efficient fine-tuning with LoRA. Our findings demonstrate that alignment-based approaches significantly improve forecasting accuracy over prompting-based methods, with DeepSeek-R1-Distill-Qwen-1.5B achieving the lowest MSE (0.432) and MAE (0.441). Our study also highlights a trade-off between accuracy and inference efficiency: prompting-based methods introduce excessive latency, making them impractical for real-time applications. Ablation experiments further validate the importance of multi-modal feature alignment, interleaved embedding fusion (IEF), and domain-informed prompting, showing that removing any of them leads to substantial performance degradation. This work extends the application of foundation models beyond natural language processing, establishing a scalable and computationally efficient framework for integrating LLMs into structured forecasting tasks. Future research directions include the development of a fully end-to-end multi-modal sports analytics system that leverages real-time vision models for spatiotemporal reasoning, as well as the exploration of generative models such as Stable Diffusion for stochastic time-series forecasting. These advancements aim to enhance automated match analysis and intelligent coaching applications, further bridging AI, computer vision, and predictive modeling in sports analytics. | |
| dc.identifier.uri | http://hdl.handle.net/10292/19242 | |
| dc.language.iso | en | |
| dc.publisher | Auckland University of Technology | |
| dc.rights.accessrights | OpenAccess | |
| dc.title | ChatPPG: Multi-Modal Alignment of Large Language Models for Time-Series Forecasting in Table Tennis | |
| dc.type | Thesis | |
| thesis.degree.grantor | Auckland University of Technology | |
| thesis.degree.name | Master of Computer and Information Sciences |
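
The abstract reports forecasting accuracy as MSE and MAE over predicted serve landing points. As a rough illustration of how such metrics can be computed for two-dimensional landing coordinates, here is a minimal sketch. The function name `mse_mae`, the `(x, y)` tuple layout, and the choice to average over every coordinate of every point are assumptions made for this example; the record does not state how the thesis normalizes coordinates or aggregates errors.

```python
from typing import Sequence, Tuple

# Hypothetical representation: one (x, y) landing point per serve.
Point = Tuple[float, float]


def mse_mae(predicted: Sequence[Point], actual: Sequence[Point]) -> Tuple[float, float]:
    """Return (MSE, MAE) averaged over all coordinates of all points.

    Assumption: errors are pooled across x and y; the thesis may
    aggregate differently.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    squared_sum = 0.0
    absolute_sum = 0.0
    count = 0
    for (px, py), (ax, ay) in zip(predicted, actual):
        for p, a in ((px, ax), (py, ay)):
            err = p - a
            squared_sum += err * err   # contributes to mean squared error
            absolute_sum += abs(err)   # contributes to mean absolute error
            count += 1
    return squared_sum / count, absolute_sum / count


# Illustrative values only, not data from the thesis.
predicted_points = [(0.0, 0.0), (1.0, 2.0)]
actual_points = [(1.0, 0.0), (1.0, 0.0)]
mse, mae = mse_mae(predicted_points, actual_points)
```

Because MSE squares each error, a single badly misplaced prediction raises it far more than it raises MAE, which is one reason the two metrics are usually reported together.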
