StyleBabel: Artistic Style Tagging and Captioning.

Dan Ruta, Andrew Gilbert, Pranav Aggarwal, Naveen Marri, Ajinkya Kale, Jo Briggs, Chris Speed, Hailin Jin, Baldo Faieta, Alex Filipkowski, Zhe Lin, John Collomosse

European Conference on Computer Vision (2022)

Abstract

We present StyleBabel, a unique open-access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools. StyleBabel was collected via an iterative method, inspired by ‘Grounded Theory’: a qualitative approach that enables annotation while co-evolving a shared language for fine-grained artistic style attribute description. We demonstrate several downstream tasks for StyleBabel, adapting the recent ALADIN architecture for fine-grained style similarity, to train cross-modal embeddings for: 1) free-form tag generation; 2) natural language description of artistic style; 3) fine-grained text search of style. To do so, we extend ALADIN with recent advances in Visual Transformer (ViT) and cross-modal representation learning, achieving state-of-the-art accuracy in fine-grained style retrieval.

Keywords

Datasets and evaluation, Image and video retrieval, Vision + language, Vision applications and systems
