Swin Transformer Based Pyramid Pooling Network for Food Segmentation

Qiankun Wang, Xiaoxiao Dong, Ruimin Wang, Hao Sun

2022 IEEE 2nd International Conference on Software Engineering and Artificial Intelligence (SEAI), 2022

Abstract
Food segmentation is one of the core tasks of food computing: it provides the basis for nutritional assessment and composition testing, and therefore matters for human health. Food images differ from general images in that they usually do not exhibit a unique spatial layout or common semantic patterns. Current food segmentation methods mainly rely on deep visual features from convolutional neural networks (CNNs), which ignore these characteristics of food images and make it difficult to achieve the best segmentation performance. In this paper, we propose a Swin Transformer-based pyramid network that captures richer background and boundary information and adaptively combines local features with global features to solve the food image segmentation task. First, the pyramid pooling module (PPM) aggregates contextual information from different regions of the food image, improving the feature representation of global information. Second, the multi-scale features produced by the PPM are assembled into a feature pyramid and weighted, so that richer edge information can be extracted. Experiments on the FoodSeg103 dataset show that the method outperforms traditional methods, with significant improvements in preserving the details of edges and veins.
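The pyramid pooling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bin sizes (1, 2, 3, 6) are an assumption borrowed from common PPM configurations, the function names are hypothetical, nearest-neighbour upsampling stands in for the bilinear interpolation typically used, and the per-branch convolutions and the feature weighting are omitted. It operates on a single 2D feature map in pure Python so that it runs without dependencies.

```python
import math

def adaptive_avg_pool(fmap, bins):
    """Average a 2D map (list of rows) into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Row range covered by bin i (floor for start, ceil for end).
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled

def upsample_nearest(grid, h, w):
    """Resize a bins x bins grid back to h x w by nearest-neighbour lookup."""
    bins = len(grid)
    return [[grid[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]

def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Return the input map plus one context map per pyramid level.

    bin_sizes is an assumed configuration, not taken from the paper.
    """
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for b in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(fmap, b), h, w))
    return channels

# Example: a 6 x 6 map whose value at (r, c) is r * 6 + c.
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
stacked = pyramid_pool(feature_map)
```

In this sketch the coarsest level (one bin) carries the global average of the map, while finer levels keep progressively more local context; stacking them alongside the input is what lets later layers mix local and global information.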
Keywords
food image segmentation, Swin Transformer, pyramid network, multiple perceptrons