
SG-DTAM: Joint Staged Generation and Dynamic Time Alignment for Missing and Unaligned Modalities in Sentiment Analysis

aut.relation.articlenumber: 129750
aut.relation.endpage: 129750
aut.relation.journal: Expert Systems with Applications
aut.relation.startpage: 129750
dc.contributor.author: Huang, Deling
dc.contributor.author: Gao, Ran
dc.contributor.author: Zhang, Geng
dc.contributor.author: Yu, Jian
dc.date.accessioned: 2025-09-28T23:39:04Z
dc.date.available: 2025-09-28T23:39:04Z
dc.date.issued: 2025-09-18
dc.description.abstract: Multimodal Sentiment Analysis (MSA) aims to infer users’ emotional states by integrating information from multiple modalities, such as language, audio, and visual data. However, real-world multimodal data often presents two critical challenges: missing modalities and unaligned multimodal sequences. Missing sources can lead to information loss, while temporal misalignment introduces inconsistencies—both of which significantly degrade analytical accuracy. While many existing approaches effectively address each challenge in isolation, few can tackle both simultaneously without resorting to complex architectures or incurring substantial computational costs. To overcome these limitations, we propose SG-DTAM, a novel framework that combines staged generation with multi-head dynamic temporal alignment. In the first stage, conditional mutual information is employed to guide a hierarchical series of cross-modal attention modules that sequentially reconstruct each missing modality. In the following alignment stage, a set of attention heads with adaptive weighting reconciles temporal discrepancies across all modalities without any reliance on external synchronization labels. Throughout the process, we introduce a dual supervision objective that combines an InfoNCE-based contrastive loss with a reconstruction loss, ensuring both precise modality synthesis and the development of resilient feature representations. We evaluate SG-DTAM on four benchmark MSA datasets—CMU-MOSI, CMU-MOSEI, IEMOCAP, and MELD. Experimental results demonstrate that our framework achieves competitive or state-of-the-art performance with relatively few learnable parameters. Notably, SG-DTAM exhibits robust performance in scenarios involving both missing and misaligned modalities, underscoring its effectiveness in real-world multimodal sentiment analysis tasks.
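The dual supervision objective described in the abstract pairs an InfoNCE-based contrastive term (pulling each generated modality toward its ground-truth counterpart, with other batch items as in-batch negatives) with a reconstruction term. A minimal NumPy sketch of such a combined loss is given below; the weighting factor `lam` and temperature are hypothetical illustration parameters, not values from the paper:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss: each anchor row should match its own
    positive row against all other positives in the batch (in-batch negatives)."""
    # L2-normalize so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs lie on the diagonal of the similarity matrix
    return -np.mean(np.diag(log_probs))

def dual_supervision_loss(generated, target, lam=0.5, temperature=0.1):
    """Combine a reconstruction term (MSE) with the InfoNCE term.
    `lam` is a hypothetical trade-off weight for illustration only."""
    recon = np.mean((generated - target) ** 2)
    contrast = info_nce(generated, target, temperature)
    return recon + lam * contrast
```

With orthogonal, perfectly matched pairs the contrastive term approaches zero, and with identical inputs the reconstruction term vanishes, so the combined loss rewards exactly the two properties the abstract names: precise synthesis and discriminative features.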
dc.identifier.citation: Expert Systems with Applications, ISSN: 0957-4174 (Print), Elsevier BV, 129750-129750. doi: 10.1016/j.eswa.2025.129750
dc.identifier.doi: 10.1016/j.eswa.2025.129750
dc.identifier.issn: 0957-4174
dc.identifier.uri: http://hdl.handle.net/10292/19873
dc.language: en
dc.publisher: Elsevier BV
dc.relation.uri: https://www.sciencedirect.com/science/article/pii/S0957417425033652
dc.rights: © 2025 Published by Elsevier Ltd. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
dc.rights.accessrights: OpenAccess
dc.subject: 46 Information and Computing Sciences
dc.subject: 4608 Human-Centred Computing
dc.subject: 4611 Machine Learning
dc.subject: Bioengineering
dc.subject: 01 Mathematical Sciences
dc.subject: 08 Information and Computing Sciences
dc.subject: 09 Engineering
dc.subject: Artificial Intelligence & Image Processing
dc.title: SG-DTAM: Joint Staged Generation and Dynamic Time Alignment for Missing and Unaligned Modalities in Sentiment Analysis
dc.type: Journal Article
pubs.elements-id: 630528

Files

Original bundle

Name: Huang et al_2025_Joint staged generation.pdf
Size: 18.98 MB
Format: Adobe Portable Document Format
Description: Journal article