OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
CVPR 2024 (2024)
Key words
Video Recognition, Video Understanding, Action Recognition, Multi-modality Video Understanding