GrokNet: Unified Computer Vision Model Trunk and Embeddings For Commerce

Sean Bell, Yiqun Liu, Sami Alsheikh, Yina Tang, Ed Pizzi, Michael Henning, Karun Singh, Omkar Parkhi, Fedor Borisyuk

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, July 2020

Abstract

In this paper, we present GrokNet, a deployed image recognition system for commerce applications. GrokNet leverages a multi-task learning approach to train a single computer vision trunk. We achieve a 2.1x improvement in exact product match accuracy when compared to the previous state-of-the-art Facebook product recognition system. We achieve this by training on 7 datasets across several commerce verticals, using 80 categorical loss functions and 3 embedding losses. We share our experience of combining diverse sources with wide-ranging label semantics and image statistics, including learning from human annotations, user-generated tags, and noisy search engine interaction data. GrokNet has demonstrated gains in production applications and operates at Facebook scale.
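The abstract describes one shared trunk trained against many categorical losses and several embedding losses at once, but gives no formula for combining them. The following is a minimal sketch of one common way to do that, a weighted sum of per-task losses, and is not taken from the paper. The function names, the head names ("category", "color"), the weights, and the cosine-based embedding loss are all hypothetical stand-ins chosen for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, given raw class scores and the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def cosine_embedding_loss(a, b, same_product):
    """Stand-in embedding loss (an assumption, not the paper's loss).

    Matching pairs are penalized for low cosine similarity,
    non-matching pairs for positive cosine similarity.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos = dot / norm
    return 1.0 - cos if same_product else max(0.0, cos)


def multi_task_loss(head_logits, head_labels, head_weights, embedding_terms):
    """Weighted sum of categorical head losses plus embedding loss terms.

    head_logits:     {head name: list of class scores from that head}
    head_labels:     {head name: true class index}
    head_weights:    {head name: scalar weight} (hypothetical values)
    embedding_terms: list of (weight, loss value) pairs
    """
    total = 0.0
    for name, logits in head_logits.items():
        total += head_weights[name] * softmax_cross_entropy(logits, head_labels[name])
    for weight, value in embedding_terms:
        total += weight * value
    return total


# Usage: two categorical heads and one embedding term for a single example.
head_logits = {"category": [2.0, 0.5, -1.0], "color": [0.1, 0.1]}
head_labels = {"category": 0, "color": 1}
head_weights = {"category": 1.0, "color": 0.5}
embedding = cosine_embedding_loss([1.0, 0.0], [1.0, 0.0], same_product=True)
loss = multi_task_loss(head_logits, head_labels, head_weights, [(1.0, embedding)])
```

In a setup like the one the abstract outlines, the same pattern would extend to many more heads, with each training example contributing only to the heads for which its source dataset supplies labels.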

Keywords

Image classification, e-commerce image understanding, multi-task learning, embedding, deep learning