VVA: Video Values Analysis

Yachun Mi, Yan Shu, Honglei Xu, Shaohui Liu, Feng Jiang

Pattern Recognition and Computer Vision, PRCV 2023, Part VII (2024)

Abstract
User-generated video content has attracted increasing attention due to its dominant role on social platforms. Analyzing the values conveyed in videos is crucial because the extensive range of video content leads to significant variation in subjective video quality. However, the research literature on Video Values Analysis (VVA), which aims to evaluate the compatibility between video content and mainstream social values, is very scarce. Moreover, existing video content analysis methods are mainly based on classification techniques, which cannot adequately handle VVA due to their coarse-grained nature. To tackle this challenge, we propose a framework that produces fine-grained scores for diverse videos, termed the Video Values Analysis Model (VVAM), which consists of an R3D-based feature extractor, a Transformer-based feature aggregation module, and an MLP regression head. In addition, since text appearing in videos can provide key clues for VVA, we design a new pipeline, termed the Text-Guided Video Values Analysis Model (TG-VVAM), in which text in videos is spotted by OCR tools and a cross-modal fusion module combines the visual and textual features. To further facilitate VVA, we construct a large-scale dataset, termed the Video Values Analysis Dataset (VVAD), which contains 53,705 short videos of various types collected from major social platforms. Experiments demonstrate that the proposed VVAM and TG-VVAM achieve promising results on the VVAD.
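The abstract describes the VVAM pipeline only at a high level. Below is a minimal sketch of that architecture, assuming PyTorch with a torchvision r3d_18 backbone as the R3D feature extractor; the hidden sizes, number of clips, and pooling strategy are hypothetical choices for illustration, not the paper's exact configuration.

```python
# Minimal sketch of the VVAM pipeline: R3D feature extractor -> Transformer
# feature aggregation -> MLP regression head. Hyperparameters are assumptions.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18


class VVAM(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_layers=2):
        super().__init__()
        backbone = r3d_18(weights=None)
        # Drop the classification head; keep the 512-d clip-level features.
        self.extractor = nn.Sequential(*list(backbone.children())[:-1])
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.aggregator = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.regressor = nn.Sequential(
            nn.Linear(d_model, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, clips):
        # clips: (batch, n_clips, 3, T, H, W) -- each video sampled as short clips.
        b, n = clips.shape[:2]
        feats = self.extractor(clips.flatten(0, 1)).flatten(1)  # (b*n, 512)
        feats = feats.view(b, n, -1)                            # (b, n, 512)
        agg = self.aggregator(feats).mean(dim=1)                # (b, 512)
        return self.regressor(agg).squeeze(-1)                  # (b,) values score


# Example: score 2 videos, each sampled as 4 clips of 16 frames at 112x112.
scores = VVAM()(torch.randn(2, 4, 3, 16, 112, 112))
```

The text-guided variant (TG-VVAM) would additionally encode OCR-spotted text and fuse it with the clip features through a cross-modal fusion module before regression; the exact fusion design is not specified in the abstract.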
Keywords
Video values analysis, Video values analysis model, Text-guided video values analysis model, Video values analysis dataset