G-VOILA: Gaze-Facilitated Information Querying in Daily Scenarios
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (2024)
Abstract
Modern information querying systems are progressively incorporating
multimodal inputs like vision and audio. However, the integration of gaze – a
modality deeply linked to user intent and increasingly accessible via
gaze-tracking wearables – remains underexplored. This paper introduces a novel
gaze-facilitated information querying paradigm, named G-VOILA, which synergizes
users' gaze, visual field, and voice-based natural language queries to
facilitate a more intuitive querying process. In a user-enactment study
involving 21 participants in 3 daily scenarios (p = 21, scene = 3), we found
ambiguity in users' query language and a gaze-voice coordination pattern in
users' natural query behaviors with G-VOILA. Based on the quantitative and
qualitative findings, we developed a design framework for the G-VOILA paradigm,
which effectively integrates the gaze data with the in-situ querying context.
Then we implemented a G-VOILA proof-of-concept using cutting-edge deep learning
techniques. A follow-up user study (p = 16, scene = 2) demonstrated its
effectiveness: it achieved higher objective and subjective scores than
a baseline without gaze data. We further conducted interviews and
provided insights for future gaze-facilitated information querying systems.