Head tracker using webcam for auralization

INTER-NOISE and NOISE-CON Congress and Conference Proceedings(2021)

引用 0|浏览2
暂无评分
摘要
Binaural rendering is a technique that seeks to generate virtual auditory environments that replicate the natural listening experience, including the three-dimensional perception of spatialized sound sources. As such, real-time knowledge of the listener’s position, or more specifically, their head and ear orientations allow the transfer of movement from the real world to virtual spaces, which consequently enables a richer immersion and interaction with the virtual scene. This study presents the use of a simple laptop integrated camera (webcam) as a head tracker sensor, disregarding the necessity to mount any hardware to the listener’s head. The software was built on top of a state-of-the-art face landmark detection model, from Google’s MediaPipe library for Python. Manipulations to the coordinate system are performed, in order to translate the origin from the camera to the center of the subject’s head and adequately extract rotation matrices and Euler angles. Low-latency communication is enabled via User Datagram Protocol (UDP), allowing the head tracker to run in parallel and asynchronous with the main application. Empirical experiments have demonstrated reasonable accuracy and quick response, indicating suitability to real-time applications that do not necessarily require methodical precision. Furthermore, cross-validation with existing hardware head trackers revealed an adequate agreement on measured head orientation, confirming its potential as a contactless head tracking device.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要