Large Kernel Attention Hashing for Efficient Image Retrieval

Xinxin Zhao, Zhuang Miao, Yufei Wang, Jiabao Wang, Yang Li

2022 14th International Conference on Wireless Communications and Signal Processing (WCSP), 2022

Abstract
Since the Vision Transformer can capture long-range dependencies in images, it has been successfully applied to hashing-based image retrieval. Unfortunately, existing Transformer-based hashing methods divide 2D images into 1D patch sequences, which ignores the 2D structure of patches and destroys local feature details. The Conv Stem used in Convolutional Neural Networks avoids this problem. However, the convolution operation excels at extracting local contextual information from an image and, in contrast to the Transformer, struggles to capture global representations. Therefore, this paper proposes a hybrid hash network, termed the Large Kernel Attention Hashing method (LKAH). It introduces a large convolution kernel and an attention mechanism into the design of the hash network, so that the network can extract both local features and long-range dependencies of images within a limited parameter budget. By combining the advantages of convolution and the Transformer while overcoming the defects of both, LKAH achieves state-of-the-art retrieval performance on two commonly used datasets, CIFAR-10 and NUS-WIDE.
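The abstract does not give implementation details, but the core idea of combining a large convolution kernel with an attention mechanism can be sketched as follows. This is a hypothetical PyTorch illustration, not the authors' code: it decomposes a large depth-wise convolution into a small depth-wise conv, a dilated depth-wise conv, and a 1x1 conv, and uses the result to gate the input feature map; a toy hash head (names `LargeKernelAttention` and `HashHead` are my own) then maps pooled features to relaxed hash codes.

```python
import torch
import torch.nn as nn


class LargeKernelAttention(nn.Module):
    """Sketch of large-kernel attention: a large receptive field is
    approximated cheaply by stacking a 5x5 depth-wise conv, a 7x7
    dilated depth-wise conv (dilation 3, ~21x21 effective field),
    and a 1x1 channel-mixing conv; the result gates the input."""

    def __init__(self, channels: int):
        super().__init__()
        # depth-wise conv captures local context
        self.dw = nn.Conv2d(channels, channels, 5, padding=2,
                            groups=channels)
        # dilated depth-wise conv extends the receptive field
        self.dw_dilated = nn.Conv2d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        # point-wise conv mixes channel information
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn  # element-wise gating acts as attention


class HashHead(nn.Module):
    """Toy hash head: global average pooling, then a linear layer
    whose tanh output gives relaxed codes in (-1, 1); binarize with
    sign() at retrieval time."""

    def __init__(self, channels: int, bits: int):
        super().__init__()
        self.fc = nn.Linear(channels, bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.mean(dim=(2, 3))          # global average pooling
        return torch.tanh(self.fc(x))   # relaxed hash codes
```

A feature map would flow through one or more attention blocks before the head; `torch.sign` on the head's output yields the binary codes used for Hamming-distance retrieval.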
Keywords
vision transformer,convolutional neural network,hashing,image retrieval,hybrid hashing network