Image Translation and Reconstruction Using a Single Dual Mode Lightweight Encoder

Jose Amendola, Linga Reddy Cenkeramaddi, Ajit Jha

IEEE Access (2024)

Abstract
In computer vision, the rich textures and semantic information of RGB images can be complemented by the robustness of thermal images to lighting variations and weather artifacts. Because many models rely on input from a single sensor modality, image translation between modalities can be a solution. Existing works use large models that work in only one translation direction. This causes problems in applications with limited computation and leaves no flexibility to work interchangeably across modalities. Three-channel cameras extract visually rich features, but processing them on embedded platforms becomes a bottleneck. Furthermore, edge computing systems impose the burden of compressing data to be sent elsewhere. To address these issues, we propose a novel architecture with a single lightweight encoder capable of working in dual mode, encoding inputs from both grayscale and thermal images into very compact latent vectors. The encoding is then used for cross-modal image translation, grayscale image colorization, and thermal image reconstruction, thus allowing 1) different downstream tasks on different modalities, 2) visually rich features from grayscale images, and 3) data compression. Four different generators are employed, and training occurs in adversarial fashion with two discriminators. The proposed loss function contains not only adversarial terms but also reconstruction error terms, which induce consistency and contrast preservation across translation and reconstruction. The results, backed by evaluation over multiple metrics, demonstrate that the model performs these tasks with competitive translation and reconstruction quality on images with different lighting conditions. Finally, we perform ablation studies to demonstrate the effectiveness of the combined loss terms.
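The abstract describes a loss that combines adversarial terms with reconstruction error terms but does not give its form. As a rough illustration only, the sketch below shows one common way such a generator loss is composed. The non-saturating adversarial term, the L1 reconstruction term, the weight `recon_weight`, and all function names are assumptions made for this example, not the formulation used in the paper.

```python
import math

def adversarial_term(d_fake_probs):
    """Assumed non-saturating generator loss: mean of -log D(G(x)).

    d_fake_probs: discriminator outputs in (0, 1] for generated images.
    """
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p + eps) for p in d_fake_probs) / len(d_fake_probs)

def reconstruction_term(output, target):
    """Assumed L1 reconstruction error: mean absolute pixel difference.

    output, target: flattened images as equal-length lists of floats.
    """
    return sum(abs(o - t) for o, t in zip(output, target)) / len(target)

def generator_loss(d_fake_probs, output, target, recon_weight=10.0):
    """Weighted sum of the two terms; recon_weight is a hypothetical setting."""
    adv = adversarial_term(d_fake_probs)
    rec = reconstruction_term(output, target)
    return adv + recon_weight * rec
```

In a setup like the one described, a term of this kind would presumably be evaluated once per generator and discriminator pairing, with the reconstruction term pulling each output toward its reference image while the adversarial term pushes it toward the target modality's distribution.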
Keywords
Image reconstruction, Gray-scale, Task analysis, Codes, Computer architecture, Feature extraction, Encoding, Generative adversarial networks (GAN), Image coding, Edge computing, Data compression, Image-to-image (I2I) translation