ASYv3: Attention-enabled pooling embedded Swin transformer-based YOLOv3 for obscenity detection

Expert Systems (2023)

Abstract
The rampant spread of explicit content across social media can harm society, so detecting and curtailing sexually explicit material is essential to limit its dissemination and protect digital communities. In this article, we propose attention-enabled pooling (ABP) embedded Swin transformer-based YOLOv3 (ASYv3), a technique that detects obscene areas in images and draws a bounding box around each offensive region. ASYv3 uses a two-step approach to improve obscenity detection. In the first step, a scalable and efficient Swin transformer block is integrated, using self-attention and model parallelism to train massive models effectively. In the second step, the embedding layer of the Swin transformer is replaced with ABP, which mitigates disruption of feature context. ABP projects raw-valued features into linear form while attending to feature context information at specified locations, resulting in optimized feature extraction. ASYv3 was trained on the annotated obscene images (AOI) dataset. The proposed model surpassed state-of-the-art methods, achieving 97% testing accuracy, 96.62% precision, 97.40% sensitivity, a 3.48% false positive rate (FPR), a 97.37% negative predictive value (NPV), and 95.59% mean average precision (mAP).
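For readers less familiar with the reported metrics, accuracy, precision, sensitivity, FPR, and NPV all follow from the four confusion-matrix counts by their standard definitions. The sketch below is only an illustration of those definitions: the function name `detection_metrics` and the counts passed to it are hypothetical and are not taken from the paper or the AOI dataset.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives. All denominators are assumed to be non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all predictions that are correct
        "precision": tp / (tp + fp),       # share of flagged items that are truly positive
        "sensitivity": tp / (tp + fn),     # share of true positives that were found (recall)
        "fpr": fp / (fp + tn),             # share of negatives wrongly flagged
        "npv": tn / (tn + fn),             # share of "negative" predictions that are correct
    }


# Hypothetical counts, for illustration only; not results from the paper.
metrics = detection_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

mAP is not derivable from a single confusion matrix: it averages precision over recall levels, and over classes, for ranked bounding-box detections, so it is left out of this sketch.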
Keywords
attention-based pooling, obscene detection, Swin transformer, YOLOv3