ODP-Transformer: Interpretation of pest classification results using image caption generation techniques

Shansong Wang, Qingtian Zeng, Weijian Ni, Cheng Cheng, Yanxue Wang

Comput. Electron. Agric. (2023)

Abstract

Pest image classification systems are key tools for identifying pests in time. However, existing systems can only predict the labels of pest images and do not interpret the image content. In this paper, image caption generation techniques are introduced to interpret the results of pest image classification. Specifically, we propose the ODP-Transformer, which imitates the three basic actions in the diagnostic process of agricultural experts: Observation, Description, and Prediction. ODP-Transformer is a two-stage model. The first stage is a pest part detector based on the Faster R-CNN framework. The second stage contains three modules, the Parts Sequence Encoder, the Description Decoder, and the Classification Decoder, which handle the image caption generation and classification tasks. In addition, a prior knowledge matrix is introduced to guide the optimization direction of the attention mechanism in the Description Decoder, which is used to learn the concept correspondences between images and texts. An agricultural pest textual and visual dataset (APTV-99) is also collected, which contains not only the semantic annotations of images but also textual descriptions of the corresponding parts. Extensive experiments are conducted on APTV-99 to evaluate the performance of ODP-Transformer. In the pest image classification task, ODP-Transformer achieves 12.91% higher accuracy than 8 commonly used CNN models. In the image caption generation task, compared with 6 other methods, ODP-Transformer improves the BLEU-1, CIDEr-D, and ROUGE metrics by 1.62, 8.08, and 1.08, respectively.
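
The abstract does not say how the prior knowledge matrix enters the training objective. As a rough illustration only, the sketch below assumes one plausible form: each row of the matrix marks which detected pest parts a caption word is expected to refer to, and a cross-entropy penalty pulls the decoder's attention toward those parts. The function names, the loss form, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, part_keys):
    """Scaled dot-product attention of one word query over part features."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale
        for key in part_keys
    ]
    return softmax(scores)


def prior_guidance_loss(weights, prior_row, eps=1e-9):
    """Cross-entropy between a normalized prior row and attention weights.

    ASSUMPTION: prior_row[j] > 0 means the current word is expected to
    refer to detected part j. This loss form is illustrative, not the
    formulation used by the paper.
    """
    total = sum(prior_row)
    if total == 0:
        # No prior for this word (e.g. a function word): no penalty.
        return 0.0
    return -sum(
        (p / total) * math.log(w + eps)
        for p, w in zip(prior_row, weights)
    )


# Toy example: one word query attending over three detected parts.
query = [1.0, 0.0]
part_keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
weights = attention_weights(query, part_keys)

aligned = prior_guidance_loss(weights, [1, 0, 0])     # prior matches attention
misaligned = prior_guidance_loss(weights, [0, 0, 1])  # prior disagrees
```

In this toy setup the penalty is smaller when the prior row points at the part the attention already favors, so adding such a term to the captioning loss would nudge attention toward the word-to-part correspondences encoded in the matrix.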

Keywords

Transformer, Image caption, Faster R-CNN, Pest classification
