
KAN: Keyframe Attention Network for Person Video Captioning

Xiangyun Zhang, Min Yang, Xu Zhang, Fan Ni, Fangqiang Hu, Aichun Zhu

2023 China Automation Congress (CAC) (2023)

Abstract
This paper presents the Keyframe Attention Network (KAN), a novel algorithm for video captioning that combines keyframe feature extraction with an attention allocation mechanism. The proposed method first applies a threshold-based keyframe extraction technique to obtain keyframes. A keyframe representation module, built on a deep residual network, then extracts essential features from these keyframes. Finally, the extracted feature vectors, along with reference captions, are fed into an attention allocation module to generate descriptive captions. The deep residual network allows an increased network depth without encountering gradient explosions. Moreover, the attention module adopts an Encoder-Decoder structure with additional attention layers, enabling effective attention allocation and yielding more accurate captions.
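The threshold-based keyframe extraction step can be illustrated with a minimal sketch. The abstract does not state which frame-difference measure or threshold value KAN uses, so everything here is an assumption made for illustration: frames are flat sequences of pixel intensities, the difference measure is the mean absolute per-pixel difference, the first frame is always kept, and the names `mean_abs_diff` and `extract_keyframes` are hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of pixel intensities.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        raise ValueError("frames must not be empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def extract_keyframes(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ from the most recent keyframe
    by more than `threshold` (hypothetical selection rule)."""
    if not frames:
        return []
    keyframes = [0]  # assumption: the first frame is always a keyframe
    for i in range(1, len(frames)):
        last_keyframe = frames[keyframes[-1]]
        if mean_abs_diff(frames[i], last_keyframe) > threshold:
            keyframes.append(i)
    return keyframes


if __name__ == "__main__":
    # Toy "video" of five two-pixel frames.
    video = [[0, 0], [1, 1], [10, 10], [11, 11], [30, 30]]
    print(extract_keyframes(video, threshold=5.0))
```

Comparing each frame against the most recent keyframe, rather than against its immediate predecessor, keeps slow gradual changes from being missed; whether the paper makes the same choice is not stated in the abstract.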
Key words
video captioning, keyframe extraction, keyframe representation, attention mechanism