LarTap: A Luminance-Aware Framework With Text-Correlation Priors for Multi-Exposure Image Fusion

Item type

Journal Article

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract

Conventional imaging devices often struggle to produce high-dynamic-range (HDR) images that accurately represent natural scenes. To overcome this limitation, multi-exposure image fusion (MEF) techniques have been introduced as a viable solution. Existing MEF approaches aim to enhance performance by optimizing or searching network architectures; however, they face challenges in precise feature extraction and scene reconstruction, leading to distortion in the fused images. Additionally, most methods do not adequately address luminance variations across different image regions, which may result in the loss of essential details. To address these challenges, we present a novel luminance-aware MEF framework that integrates text-correlation priors (LarTap). By embedding textual information into the fusion process, the proposed framework enhances content extraction and comprehension. Specifically, it consists of two key components: the text-image correlation network (N1) and the multi-exposure fusion network (N2). First, N1 performs correlation training to achieve a holistic alignment between text and image pairs. Its iterative vision encoders (VEs) generate text-correlated prior knowledge to facilitate the fusion process in N2. Second, N2 leverages these priors for scene reconstruction and dynamically adjusts luminance based on comparative perception. Extensive experiments on multiple datasets demonstrate that LarTap outperforms state-of-the-art methods.
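
The abstract describes LarTap only at the architecture level, so the following Python (PyTorch) sketch is purely illustrative: the module names, layer choices, feature dimensions, and the assumption of exactly two exposure inputs are placeholders, not the authors' implementation. It shows the general idea of a correlation network (N1) producing a text-correlated prior that conditions a fusion network (N2), which merges the exposures and rescales luminance.

    # Illustrative sketch only: the abstract does not specify LarTap's layers,
    # so every module, dimension, and name below is an assumption.
    import torch
    import torch.nn as nn

    class TextImageCorrelationNet(nn.Module):
        """N1 (sketch): aligns text with image features and emits a text-correlated prior."""
        def __init__(self, text_dim=512, img_channels=3, prior_dim=64):
            super().__init__()
            # Placeholder vision encoder standing in for the paper's iterative VEs.
            self.vision_encoder = nn.Sequential(
                nn.Conv2d(img_channels, prior_dim, 3, padding=1), nn.ReLU(),
                nn.Conv2d(prior_dim, prior_dim, 3, padding=1), nn.ReLU(),
            )
            # Projects a text embedding so it can modulate the image features.
            self.text_proj = nn.Linear(text_dim, prior_dim)

        def forward(self, image, text_embedding):
            feat = self.vision_encoder(image)                        # B x C x H x W
            txt = self.text_proj(text_embedding)[:, :, None, None]   # B x C x 1 x 1
            return feat * torch.sigmoid(txt)                         # text-correlated prior

    class MultiExposureFusionNet(nn.Module):
        """N2 (sketch): fuses under-/over-exposed inputs guided by the prior from N1."""
        def __init__(self, prior_dim=64):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Conv2d(3 * 2 + prior_dim, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 3, 3, padding=1),
            )
            # Placeholder for the luminance adjustment the abstract describes.
            self.luminance_gain = nn.Parameter(torch.ones(1))

        def forward(self, under_exposed, over_exposed, prior):
            x = torch.cat([under_exposed, over_exposed, prior], dim=1)
            fused = torch.sigmoid(self.fuse(x))
            return torch.clamp(fused * self.luminance_gain, 0.0, 1.0)

In this sketch the prior simply gates the fused features channel-wise and a single learned gain stands in for the comparative, region-aware luminance adjustment; the actual paper should be consulted for the real mechanisms and training objectives.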

Source

IEEE Transactions on Circuits and Systems for Video Technology, ISSN: 1051-8215 (Print); 1558-2205 (Online), Institute of Electrical and Electronics Engineers (IEEE), PP(99), 1-1. doi: 10.1109/tcsvt.2025.3562564

Rights statement

This article has been accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology. This is the author's version, which has not been fully edited; content may change prior to final publication.