The Video Captioning Method Based On The Spatial-Temporal Information and Attention Mechanism
2021 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), 2021
Abstract
To effectively exploit the complementarity of features at different levels in different regions of a video, and to improve the accuracy of video text descriptions, we propose a video captioning method based on spatiotemporal information and an attention mechanism. First, the Faster-RCNN and VGG-16 networks are used to extract the high-level features of the regions of interest and significant ...
Keywords
Measurement, Adaptation models, Conferences, Computational modeling, Signal processing, Feature extraction, Spatiotemporal phenomena