A Comprehensive Study on VLAD

Neural Processing Letters (2021)

Abstract

Recently, the vector of locally aggregated descriptors (VLAD) has proven highly effective in diverse computer vision tasks, including image retrieval, scene classification, and action recognition. Its success stems from its powerful representation ability and computational efficiency. However, its theoretical foundation remains unclear, as does its connection to basic yet important algorithms, e.g., the bag-of-words model and match kernels. It is also unclear how its performance is affected by parameter configurations, e.g., normalization and pooling, which are widely used in state-of-the-art algorithms based on local features. In this paper, with the aim of achieving the full capacity of VLAD, we conduct a comprehensive and in-depth study from the perspectives of both theoretical analysis and experimental practice. As a theoretical contribution, we provide a new formulation of VLAD via match kernels, which connects VLAD with existing important encoding methods based on local features. As a contribution to the practical use of VLAD, we comprehensively investigate the roles and effects of two widely used operations in local feature encoding: normalization and pooling. To the best of our knowledge, our work provides the first comprehensive study of VLAD, which will not only enable a full understanding of it but also provide important guidance for state-of-the-art algorithms based on local features. We have conducted extensive experiments on three benchmark datasets, Scene-15, Caltech 101, and PPMI, for both image classification and action recognition.
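
For readers unfamiliar with the encoding, the following is a minimal sketch of the commonly used VLAD formulation: each local descriptor is assigned to its nearest codebook centroid, the residuals are sum-pooled per centroid, and the concatenated vector is normalized. It is not taken from the paper. The function names (`nearest_centroid`, `vlad_encode`), the toy data, and the choice of signed power normalization with `alpha=0.5` followed by L2 normalization are illustrative assumptions, not the configurations the authors evaluate.

```python
import math


def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[k])),
    )


def vlad_encode(descriptors, centroids, alpha=0.5):
    """Encode local descriptors as a flat VLAD vector of length K * D.

    Hypothetical helper for illustration; alpha is an assumed default.
    """
    k, d = len(centroids), len(centroids[0])
    blocks = [[0.0] * d for _ in range(k)]
    # Sum pooling: accumulate residuals x - c_j in the block of the nearest centroid.
    for x in descriptors:
        j = nearest_centroid(x, centroids)
        for i in range(d):
            blocks[j][i] += x[i] - centroids[j][i]
    # Concatenate the K residual blocks into one vector.
    flat = [v for block in blocks for v in block]
    # Signed power normalization (assumed choice): sign(v) * |v|^alpha.
    flat = [math.copysign(abs(v) ** alpha, v) for v in flat]
    # Global L2 normalization; an all-zero vector is returned unchanged.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy example: two 2-D centroids and three local descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
encoding = vlad_encode(descriptors, centroids)
```

Swapping the accumulation step for a per-dimension maximum, or changing `alpha`, gives a simple way to experiment with the pooling and normalization choices the abstract refers to.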

Keywords

Vector of Locally Aggregated Descriptors (VLAD), Kernel, Normalization, Pooling
