Grounded Parsing of Object Attributes and Prepositions CS 288 Final Project
google(2010)
Abstract
High-level computer vision and natural language processing are thoroughly intertwined, with the potential to jointly improve performance. We propose a well-defined subset of this underexplored overlap of problems, centered on improving grounded parsing of text and object recognition in images for related pairs of images and text descriptions. We gather a new dataset and present a parsing algorithm to extract object attributes and relations from natural descriptions of images. Using ground truth data, we evaluate our performance and visualize object co-occurrences and prepositions using an annotated set of images. Our results are highly encouraging, and inform our suggestions for further work.