Audio-Visual Processing in Meetings: Seven Questions and Some AMI Answers

Machine Learning for Multimodal Interaction (2006)

Abstract
The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the required component technologies, organized around the R&D themes of group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing workpackage within AMI addresses automatic recognition from audio, video, and combined audio-video streams recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated for answering these questions automatically.
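
The abstract does not specify the algorithms themselves, so the following is only a minimal illustrative sketch of what answering "Who is acting during the meeting?" by audio-visual fusion might look like. The function name, the per-participant score inputs, the fusion weights, and the threshold are all hypothetical assumptions, not the AMI project's actual methods.

```python
# Illustrative sketch only: a simple late fusion of per-participant audio and
# motion scores. All names, weights, and the threshold are assumptions made
# for this example and are not taken from the paper.
import numpy as np

def who_is_acting(audio_energy, motion_energy,
                  w_audio=0.6, w_video=0.4, threshold=0.5):
    """Return indices of participants judged 'acting' in one time frame.

    audio_energy, motion_energy: per-participant scores in [0, 1], e.g.
    normalized speech energy from a headset microphone and normalized
    frame-difference motion from a close-up camera.
    """
    # Weighted late fusion of the two modality scores per participant.
    fused = w_audio * np.asarray(audio_energy) + w_video * np.asarray(motion_energy)
    # A participant is 'acting' when the fused score exceeds the threshold.
    return np.flatnonzero(fused > threshold)

# Four participants: participant 1 speaks, participant 3 gestures silently.
print(who_is_acting([0.1, 0.9, 0.0, 0.3], [0.2, 0.7, 0.1, 0.9]))
# -> [1 3] under these assumed weights and threshold
```

In a real meeting-analysis pipeline the scores would come from dedicated audio and video front ends, and the hand-set weights and threshold here would typically be replaced by a trained classifier; this sketch only shows the shape of the fusion step.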