Sketchformer: Transformer-based Representation for Sketched Structure

CVPR (2020)

Abstract
Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in vector form, i.e., as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch-based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary learning tokenization scheme, yields state-of-the-art performance in classification and image retrieval tasks when compared against baseline representations driven by LSTM sequence-to-sequence architectures: SketchRNN and derivatives. We show that sketch reconstruction and interpolation are improved significantly by the Sketchformer embedding for complex sketches with longer stroke sequences.
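To make the idea of a tokenized input representation concrete, here is a minimal sketch of one way a dictionary of pen movements could be learned and applied. It assumes that a sketch is an array of (dx, dy) pen offsets and that the dictionary is a plain k-means codebook; the function names `build_codebook`, `tokenize`, and `detokenize` are hypothetical, pen-state flags are left out, and none of this is claimed to be the authors' exact procedure.

```python
import numpy as np

def build_codebook(offsets, k, iters=20, seed=0):
    """Plain k-means over (dx, dy) pen offsets; returns a (k, 2) codebook.

    Assumption: k-means stands in for the 'dictionary learning' step.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct training points.
    idx = rng.choice(len(offsets), size=k, replace=False)
    centroids = offsets[idx].astype(float)
    for _ in range(iters):
        # Assign each offset to its nearest centroid.
        dists = np.linalg.norm(offsets[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members; empty clusters stay put.
        for j in range(k):
            members = offsets[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def tokenize(sketch, codebook):
    """Map each (dx, dy) step of one sketch to the index of its nearest codeword."""
    dists = np.linalg.norm(sketch[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def detokenize(tokens, codebook):
    """Recover approximate (dx, dy) steps from token ids (lossy)."""
    return codebook[tokens]
```

The resulting integer sequence can be fed to an embedding layer like any other token sequence, whereas the continuous variant would pass the raw offsets directly. The codebook size `k` trades reconstruction fidelity against vocabulary size.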
Keywords
transformer-based representation, free-hand sketch input, sketched structure, stroke sequences, complex sketches, sketch reconstruction, sequence architectures, LSTM sequence, image retrieval tasks, dictionary learning tokenization scheme, learned embedding, tokenized input representations, continuous input representations, sketch based image retrieval, sketch classification, Sketchformer, vector form