Memory-guided Representation Matching for Unsupervised Video Anomaly Detection
Journal of Visual Communication and Image Representation (2024)
Abstract
Recent work on Video Anomaly Detection (VAD) has advanced the unsupervised setting, known as Unsupervised VAD (UVAD), bringing the task closer to practical applications. Unlike the classic VAD task, which requires a clean training set containing only normal events, UVAD aims to identify abnormal frames without any labeled normal or abnormal training data. Many existing UVAD methods address this challenge with handcrafted surrogate tasks, such as frame reconstruction. However, we argue that these surrogate tasks are sub-optimal solutions that are inconsistent with the essence of anomaly detection. In this paper, we propose a novel approach for UVAD that directly detects anomalies based on similarities between events in videos. Our method generates representations for events while simultaneously capturing prototypical normality patterns, and detects anomalies based on whether an event’s representation matches the captured patterns. The proposed model comprises a memory module that captures normality patterns and a representation learning network that produces representations matching the memory module for normal events. A pseudo-label generation module and an anomalous event generation module for negative learning are further designed to help the model work under the strictly unsupervised setting. Experimental results demonstrate that the proposed method outperforms existing UVAD methods and achieves competitive performance compared with classic VAD methods.
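To make the matching idea concrete, the following is a minimal sketch of how an event representation might be scored against a set of memory items. The abstract does not specify the similarity measure, the memory layout, or the scoring rule, so everything here is an assumption for illustration: the function names `l2_normalize` and `anomaly_scores` are hypothetical, cosine similarity is an assumed choice, and the memory items are toy vectors rather than learned patterns.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length, guarding against division by zero."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def anomaly_scores(event_reprs, memory):
    """Hypothetical scoring rule (an assumption, not the paper's):
    score = 1 - max cosine similarity between an event representation
    and any memory item. A low score means the event matches a stored
    normality pattern; a high score means it matches none of them.
    """
    e = l2_normalize(np.asarray(event_reprs, dtype=float))
    m = l2_normalize(np.asarray(memory, dtype=float))
    sim = e @ m.T  # shape: (num_events, num_memory_items)
    return 1.0 - sim.max(axis=1)


# Toy memory with two prototypical "normal" patterns (illustrative values).
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

# Three toy event representations: one aligned with a memory item,
# one between the two items, one pointing away from both.
events = np.array([[2.0, 0.0],
                   [1.0, 1.0],
                   [-1.0, 0.0]])

scores = anomaly_scores(events, memory)
print(scores)
```

In this toy setup the first event matches a memory item exactly and receives the lowest score, while the third matches neither item and receives the highest. A real system would learn both the representations and the memory items, and would choose a decision threshold on the scores.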
Key words
Video anomaly detection, Video understanding, Representation learning