POS-BERT: Point cloud one-stage BERT pre-training

Expert Systems with Applications (2024)

Abstract
Recently, the pre-training paradigm combining Transformers with BERT-style masked language modeling has achieved tremendous success not only in NLP but also on images and point clouds. However, directly extending BERT from NLP to point clouds requires first training a discrete Variational AutoEncoder (dVAE) as the tokenizer, which results in a complex two-stage process, as in Point-BERT. Inspired by BERT and MoCo, we propose POS-BERT, a one-stage BERT pre-training method for point clouds. Specifically, we pre-train the point cloud Transformer with a masked patch modeling (MPM) task, which aims to recover the information of masked patches under the supervision of a tokenizer's output. Unlike Point-BERT, whose tokenizer is trained separately and then frozen, we propose a momentum tokenizer that is dynamically updated while the Transformer is trained. Furthermore, to better learn high-level semantic representations, we integrate contrastive learning into the framework to maximize class-token consistency between augmented point cloud pairs. Experiments show that POS-BERT achieves state-of-the-art performance on ModelNet40 linear SVM classification with a fixed feature extractor, exceeding Point-BERT by 3.5%. In addition, POS-BERT yields significant improvements on many downstream tasks, including fine-tuned classification, few-shot classification, and part segmentation. The code and trained models will be released at https://github.com/fukexue/POS-BERT.git.
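To make the one-stage idea concrete, the sketch below illustrates a momentum (EMA) tokenizer, a masked-patch-modeling loss supervised by that tokenizer, and an InfoNCE-style class-token consistency loss. It is a minimal PyTorch-flavored sketch based only on the abstract: the names (`MomentumTokenizer`, `mpm_loss`, `class_token_contrastive_loss`), the cosine-similarity MPM objective, and the momentum value are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of the components described in the abstract.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class MomentumTokenizer(nn.Module):
    """Tokenizer kept as an exponential moving average (EMA) of the encoder,
    in the spirit of MoCo, so no separate dVAE training stage is required."""

    def __init__(self, encoder: nn.Module, momentum: float = 0.999):
        super().__init__()
        self.momentum = momentum
        self.ema = copy.deepcopy(encoder)          # frozen copy, updated only by EMA
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, encoder: nn.Module) -> None:
        # theta_tok <- m * theta_tok + (1 - m) * theta_enc
        for p_ema, p in zip(self.ema.parameters(), encoder.parameters()):
            p_ema.mul_(self.momentum).add_(p, alpha=1.0 - self.momentum)

    @torch.no_grad()
    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # Produce target patch tokens without gradients.
        return self.ema(patches)


def mpm_loss(pred_tokens: torch.Tensor,
             target_tokens: torch.Tensor,
             mask: torch.Tensor) -> torch.Tensor:
    """Masked patch modeling: match the encoder's predictions for masked
    patches to the momentum tokenizer's output for the same patches."""
    sim = F.cosine_similarity(pred_tokens, target_tokens.detach(), dim=-1)
    return (1.0 - sim)[mask].mean()


def class_token_contrastive_loss(cls_a: torch.Tensor,
                                 cls_b: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Maximize agreement between class tokens of two augmented views of the
    same point cloud (InfoNCE-style stand-in for the paper's exact loss)."""
    a = F.normalize(cls_a, dim=-1)
    b = F.normalize(cls_b, dim=-1)
    logits = a @ b.t() / temperature               # (B, B) similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```

In such a setup, each training step would compute both losses on an augmented pair of point clouds and then call `MomentumTokenizer.update(encoder)`, so the tokenizer improves jointly with the Transformer instead of being pre-trained and frozen as in Point-BERT.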
Keywords
Point cloud pre-training, BERT, Masked patch modeling, Classification