We Know Where They Are Looking at From the RGB-D Camera: Gaze Following in 3D

Zhengxi Hu, Dingye Yang, Shilei Cheng, Lei Zhou, Shichao Wu, Jingtai Liu

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2022)

Abstract
Inferring the gaze target, or gaze following, is an effective way to understand human actions and intentions, but it remains quite challenging. Some existing studies on gaze estimation cannot accurately locate the gaze target in a 3D scene by gaze direction alone, while other studies on gaze following have failed to exploit the contexts in the 3D scene. In this article, we make full use of the information obtained by the RGB-D camera and innovatively expand gaze target estimation from the 2D image to 3D space through the predicted 3D gaze vector. Specifically, we build a new 3D gaze-following dataset, the RGB-D Attention dataset, which contains 3D real-world gaze behaviors. In addition, we extend the GazeFollow dataset with depth information to utilize its diverse scene information in the training process of 3D gaze following. Then, considering the gaze direction as a crucial clue, we propose a novel gaze vector space containing 3D information and a 3D gaze pathway for learning gaze behavior in the 3D scene. After two-stage training, the entire model outputs the predicted 3D gaze vector and the predicted gaze heatmap, which are used to estimate the 3D gaze target in the inference algorithm. Experiments in 3D scenes show that our method reduces the predicted average distance error to 0.307 m and the predicted average angle error to 19.8 degrees. Compared with the state-of-the-art gaze inference method, our proposed method reduces the prediction error by more than 45%. Our web page is at https://sites.google.com/view/3dgazefollow.
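The abstract does not spell out the inference step, but a common way to combine a predicted gaze heatmap with RGB-D data is to take the heatmap peak as the 2D gaze point, lift it into 3D with the depth map and camera intrinsics (pinhole model), and score predictions by the angle between gaze vectors. The sketch below illustrates that generic pipeline; the function names, the use of a simple argmax peak, and the intrinsics layout are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    # Pinhole back-projection: pixel (u, v) plus metric depth -> 3D point
    # in the camera frame.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

def estimate_gaze_target(heatmap, depth_map, intrinsics):
    # Illustrative inference step (assumption, not the paper's method):
    # take the heatmap argmax as the 2D gaze point, then lift it to 3D
    # using the aligned depth value at that pixel.
    v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    fx, fy, cx, cy = intrinsics
    return backproject(u, v, depth_map[v, u], fx, fy, cx, cy)

def angular_error_deg(pred_vec, gt_vec):
    # Angle between predicted and ground-truth 3D gaze vectors, in degrees
    # (the kind of metric behind the reported 19.8-degree average error).
    cos = np.dot(pred_vec, gt_vec) / (np.linalg.norm(pred_vec) * np.linalg.norm(gt_vec))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

In practice the depth at a single pixel can be noisy or missing, so a real system would likely aggregate depth over a small window around the peak before back-projecting.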
Keywords
Three-dimensional displays, Estimation, Cameras, Training, Robots, Robot kinematics, Annotations, 3D gaze following, 3D gaze pathway, gaze vector space, RGB-D Attention dataset, RGB-D camera