CVPR 2023
TL;DR: PLA leverages powerful VL foundation models to construct hierarchical 3D-text pairs for 3D open-world learning.
*Qualitative open-vocabulary results: working space, piano, vending machine.*
- Release caption processing code
Please refer to INSTALL.md for installation instructions.
Please refer to DATASET.md for dataset preparation.
Please refer to MODEL.md for training and inference scripts and pretrained models.
If you find this project useful in your research, please consider citing:
```
@inproceedings{ding2022language,
    title={PLA: Language-Driven Open-Vocabulary 3D Scene Understanding},
    author={Ding, Runyu and Yang, Jihan and Xue, Chuhui and Zhang, Wenqing and Bai, Song and Qi, Xiaojuan},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}
```

Code is partly borrowed from OpenPCDet, PointGroup, and SoftGroup.


