Forward Propagation, Backward Regression, and Pose Association for Hand Tracking in the Wild

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)

Abstract
We propose HandLer, a novel convolutional architecture that can jointly detect and track hands online in unconstrained videos. HandLer is based on Cascade-RCNN with three additional novel stages. The first stage is Forward Propagation, where the features from frame t-1 are propagated to frame t based on previously detected hands and their estimated motion. The second stage is Detection and Backward Regression, which uses the outputs of forward propagation to detect hands in frame t and regress their relative offsets in frame t-1. The third stage uses an off-the-shelf human pose method to link any fragmented hand tracklets. We train the forward propagation, backward regression, and detection stages end-to-end together with the other Cascade-RCNN components. To train and evaluate HandLer, we also contribute YouTube-Hand, the first challenging large-scale dataset of unconstrained videos annotated with hand locations and their trajectories. Experiments on this dataset and other benchmarks show that HandLer outperforms existing state-of-the-art tracking algorithms by a large margin. Code and data are available at http://vision.cs.stonybrook.edu/~mingzhen/handlerr/.
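
The forward-propagation and backward-regression stages can be illustrated with a minimal sketch. This is not the authors' implementation: the box representation, the per-hand motion estimate, and the helper names (propagate_boxes, link_by_backward_offset, iou) are hypothetical simplifications of the stages described in the abstract, using plain boxes instead of convolutional features.

# Hypothetical sketch of HandLer's propagate/link idea (not the authors' code).
# Boxes are (x1, y1, x2, y2); "motion" is an assumed per-hand (dx, dy) velocity estimate.

from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]

def propagate_boxes(prev_boxes: Dict[int, Box],
                    motion: Dict[int, Tuple[float, float]]) -> Dict[int, Box]:
    """Forward propagation: shift each hand detected in frame t-1 by its
    estimated motion to obtain a proposal region for frame t."""
    proposals = {}
    for tid, (x1, y1, x2, y2) in prev_boxes.items():
        dx, dy = motion.get(tid, (0.0, 0.0))
        proposals[tid] = (x1 + dx, y1 + dy, x2 + dx, y2 + dy)
    return proposals

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_by_backward_offset(detections: List[Box],
                            backward_offsets: List[Tuple[float, float]],
                            prev_boxes: Dict[int, Box],
                            iou_thresh: float = 0.5) -> Dict[int, int]:
    """Backward regression: each frame-t detection also predicts where it was
    in frame t-1; matching that back-shifted box against the previous frame's
    detections links detections into tracklets."""
    links = {}  # detection index in frame t -> track id from frame t-1
    for i, ((x1, y1, x2, y2), (dx, dy)) in enumerate(zip(detections, backward_offsets)):
        back_box = (x1 - dx, y1 - dy, x2 - dx, y2 - dy)
        best_tid, best_iou = None, iou_thresh
        for tid, pbox in prev_boxes.items():
            score = iou(back_box, pbox)
            if score > best_iou:
                best_tid, best_iou = tid, score
        if best_tid is not None:
            links[i] = best_tid
    return links

# Toy usage: one hand moving right by roughly 10 px between frames.
prev = {0: (100.0, 100.0, 150.0, 160.0)}
vel = {0: (10.0, 0.0)}
print(propagate_boxes(prev, vel))   # proposal near (110, 100, 160, 160)
print(link_by_backward_offset([(111.0, 101.0, 161.0, 159.0)], [(11.0, 1.0)], prev))  # {0: 0}

In the paper the analogous operations act on feature maps inside Cascade-RCNN and are trained end-to-end; this sketch only mirrors the geometric intuition of propagating forward with motion and regressing backward to associate detections.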
Keywords
Motion and tracking