FERGI: Automatic Annotation of User Preferences for Text-to-Image Generation from Spontaneous Facial Expression Reaction
arXiv (2023)

Abstract
Researchers have proposed using human preference feedback data to
fine-tune text-to-image generative models. However, the scalability of human
feedback collection has been limited by its reliance on manual annotation.
Therefore, we develop and test a method to automatically annotate user
preferences from their spontaneous facial expression reactions to the generated
images. We collect a dataset of Facial Expression Reaction to Generated Images
(FERGI) and show that the activations of multiple facial action units (AUs) are
highly correlated with user evaluations of the generated images. Specifically,
AU4 (brow lowerer) is reflective of negative evaluations of the generated image
whereas AU12 (lip corner puller) is reflective of positive evaluations. These
can be useful in two ways. Firstly, we can automatically annotate user
preferences between image pairs with substantial difference in these AU
responses with an accuracy significantly outperforming state-of-the-art scoring
models. Secondly, directly integrating the AU responses with the scoring models
improves their consistency with human preferences. Finally, this method of
automatic annotation with facial expression analysis can be potentially
generalized to other generation tasks. The code is available at
https://github.com/ShuangquanFeng/FERGI, and the dataset is also available at
the same link for research purposes.
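The pair-annotation idea described above can be sketched in a few lines: combine each image's AU responses into a valence score (AU12 positive, AU4 negative, per the paper's findings) and annotate a preference only when the difference between the two images' scores is substantial. The function names, the score definition, and the threshold below are illustrative assumptions, not the authors' implementation.

```python
def valence_score(au_activations):
    """Combine AU activations into a single valence score.

    Per the paper's findings, AU12 (lip corner puller) reflects positive
    evaluation and AU4 (brow lowerer) reflects negative evaluation.
    Score definition is a hypothetical sketch.
    """
    return au_activations.get("AU12", 0.0) - au_activations.get("AU4", 0.0)


def annotate_preference(au_img1, au_img2, threshold=0.5):
    """Return 1 or 2 for the preferred image, or None (abstain) when the
    AU response difference is too small to annotate confidently.

    `threshold` is an assumed hyperparameter, not a value from the paper.
    """
    diff = valence_score(au_img1) - valence_score(au_img2)
    if diff > threshold:
        return 1
    if diff < -threshold:
        return 2
    return None  # no substantial AU difference: leave the pair unannotated
```

Usage: a strong AU12 response to image 1 and a strong AU4 response to image 2 yields a preference for image 1, e.g. `annotate_preference({"AU12": 1.2, "AU4": 0.1}, {"AU12": 0.0, "AU4": 0.9})` returns `1`; near-equal responses return `None`, mirroring the paper's restriction to pairs with a substantial difference in AU responses.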