Timestamp-supervised action segmentation from the perspective of clustering

arXiv (Cornell University) (2023)

Abstract
Video action segmentation under timestamp supervision has recently received much attention because of its low annotation cost. Most existing methods generate pseudo-labels for all frames in each video to train the segmentation model. However, these methods suffer from incorrect pseudo-labels, especially for the semantically unclear frames in the transition region between two consecutive actions, which we call ambiguous intervals. To address this issue, we propose a novel framework from the perspective of clustering, which consists of two parts. First, pseudo-label ensembling generates incomplete but high-quality pseudo-label sequences, in which the frames in ambiguous intervals have no pseudo-labels. Second, iterative clustering propagates the pseudo-labels into the ambiguous intervals step by step, and the updated pseudo-label sequences are then used to train the model. We further introduce a clustering loss, which encourages the features of frames within the same action segment to be more compact. Extensive experiments show the effectiveness of our method.
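The abstract does not spell out the clustering rule, so the following is only a minimal sketch of the general idea of propagating labels into ambiguous intervals, not the authors' algorithm. It assumes per-frame feature vectors, an incomplete label sequence in which a hypothetical marker `UNLABELED = -1` flags frames in ambiguous intervals, and a greedy rule that, in each pass, assigns one edge frame of every unlabeled gap to the neighbouring segment whose mean feature (centroid) is closest. The function names `find_gaps`, `run_centroid`, and `propagate_labels` are illustrative, not taken from the paper.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for frames inside an ambiguous interval


def find_gaps(labels):
    """Return (start, end) pairs (end exclusive) of maximal unlabeled runs."""
    gaps, t, n = [], 0, len(labels)
    while t < n:
        if labels[t] == UNLABELED:
            start = t
            while t < n and labels[t] == UNLABELED:
                t += 1
            gaps.append((start, t))
        else:
            t += 1
    return gaps


def run_centroid(features, labels, idx, step):
    """Mean feature of the contiguous same-label run reached by walking from idx in direction step."""
    j = idx
    while 0 <= j < len(labels) and labels[j] == labels[idx]:
        j += step
    lo, hi = (j + 1, idx + 1) if step < 0 else (idx, j)
    return features[lo:hi].mean(axis=0)


def propagate_labels(features, labels, max_iters=1000):
    """Greedily grow labeled segments into unlabeled gaps, one frame per gap per pass.

    Assumption: this centroid-distance rule stands in for the paper's
    clustering step, whose details are not given in the abstract.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels).copy()
    n = len(labels)
    for _ in range(max_iters):
        changed = False
        for start, end in find_gaps(labels):
            has_left, has_right = start > 0, end < n
            if not has_left and not has_right:
                continue  # nothing labeled at all, nothing to propagate
            d_left = d_right = np.inf
            if has_left:
                c_left = run_centroid(features, labels, start - 1, -1)
                d_left = np.linalg.norm(features[start] - c_left)
            if has_right:
                c_right = run_centroid(features, labels, end, +1)
                d_right = np.linalg.norm(features[end - 1] - c_right)
            # Label whichever edge frame sits closer to its neighbouring segment.
            if d_left <= d_right:
                labels[start] = labels[start - 1]
            else:
                labels[end - 1] = labels[end]
            changed = True
        if not changed:
            break
    return labels
```

Because only gap edges are ever labeled, every segment stays temporally contiguous, and each pass shrinks every gap that has a labeled neighbour by one frame, so the loop terminates once all such gaps are filled. Calling `propagate_labels(features, labels)` returns a new array and leaves the input label sequence unchanged.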
Key words
action, segmentation, timestamp-supervised