Pairwise-Covariance Multi-View Discriminant Analysis For Robust Cross-View Human Action Recognition

IEEE Access (2021)

Cited by 4 | Views 1
Abstract
Robust human action recognition (HAR) under varying camera viewpoints is a critical requirement for practical deployment. In this paper, we propose a novel method that leverages successful deep learning-based features for action representation, together with multi-view analysis, to accomplish robust HAR under viewpoint changes. Specifically, we investigate various deep learning techniques, from 2D to 3D CNNs, to capture the spatial and temporal characteristics of actions at each separate camera view. A common feature space is then constructed to retain view-invariant features across the extracted streams. This is carried out by learning a set of linear transformations that project view-private features into the common space, in which the classes are well separated from each other. To this end, we first adopt Multi-view Discriminant Analysis (MvDA). The original MvDA can fail to find the most class-discriminative common space because its objective concentrates on pushing classes away from the global mean while remaining unaware of the distances between specific pairs of adjoining classes. We therefore introduce a pairwise-covariance maximizing extension, termed pc-MvDA, that takes pairwise distances between classes into account. The proposed method is also better suited to large, high-dimensional multi-view datasets. Extensive experimental results on four datasets (IXMAS, MuHAVi, MICAGes, NTU RGB+D) show that pc-MvDA achieves consistent performance gains, especially on harder classes. The code is publicly available for research purposes at https://github.com/inspiros/pcmvda.
Keywords
Feature extraction, Training, Three-dimensional displays, Neural networks, Cameras, Deep learning, Correlation, Multi-view analysis, action recognition, deep learning, cross-view recognition
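The abstract's core intuition — that a global-mean discriminant objective can look well separated while two adjoining classes remain confusable, whereas a pairwise criterion exposes them — can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: the toy features, class means, and both criteria are illustrative stand-ins for samples already projected into a learned common space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 classes in a shared 2-D common space.
# Classes 1 and 2 are deliberately adjoining; class 0 is far away.
class_means = np.array([[0.0, 0.0], [3.0, 0.1], [3.2, 0.3]])
n_per_class = 30
Y = np.vstack([rng.normal(m, 0.2, size=(n_per_class, 2))
               for m in class_means])
labels = np.repeat(np.arange(3), n_per_class)

mu_global = Y.mean(axis=0)
mu_c = np.array([Y[labels == k].mean(axis=0) for k in range(3)])

# Global-mean criterion (MvDA-style): total squared distance of class
# means from the global mean -- large here, yet blind to the fact that
# classes 1 and 2 nearly overlap.
global_spread = np.sum(np.linalg.norm(mu_c - mu_global, axis=1) ** 2)

# Pairwise criterion (the idea behind pc-MvDA): inspect every pair of
# class means, so the adjoining pair (1, 2) is revealed by its tiny gap.
pair_dists = {(i, j): np.linalg.norm(mu_c[i] - mu_c[j])
              for i in range(3) for j in range(i + 1, 3)}
min_pair = min(pair_dists.values())

print(f"global spread: {global_spread:.2f}, closest pair gap: {min_pair:.2f}")
```

Under this setup the global spread is large while the closest pair gap is small, which is exactly the failure mode the pairwise-covariance extension is designed to penalize.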