The Potential of a Visual Dialogue Agent in a Tandem Automated Audio Description System for Videos

Abigale Stangl, Shasta Ihorn, Yue-Ting Siu, Aditya Bodi, Mar Castanon, Lothar D. Narins, Ilmi Yoon

Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, ASSETS 2023 (2023)

Abstract

The relentless pace of video production exacerbates the digital accessibility gap that individuals who are blind or low vision (BLV) face on a daily basis, resulting in disproportionate exclusion from community opportunities and risk management. Whereas previous automated audio description (AD) systems provide single-tool approaches for delivering minimum viable description (MVD) or delivering on-demand visual question answering (VQA), we present a tandem AI-based AD tool that combines MVD and on-demand VQA. A user study with 26 BLV individuals explored how the tandem system may be used under the conditions of delivering MVD and/or on-demand VQA with AI-only or human-in-the-loop support. When each tool was used in isolation, AI-only conditions scored significantly lower in both user enjoyment and comprehension. When used in tandem, AI-only conditions matched outcomes delivered with human-in-the-loop, which suggests that AI-only AD tools may be most effective when both types of tools are used in tandem. A multimodal analysis of interactions with the tandem system revealed areas for system improvement in terms of the timing of AD delivery and accurate content delivery. We discuss how the use of both types of tools in a tandem system can mitigate some of the digital frictions that have plagued efforts in machine learning and automated tools for accessibility.

Keywords

Audio Description, AI, Visual Dialogue, Virtual Agents, Virtual Volunteer, Visual Assistance, Visual Question Answering, Minimum Viable Description, Blind and Low Vision
