Architecture diagram of BIP3D. Red stars mark the parts that have been modified or added relative to the base model, GroundingDINO, and dashed lines mark optional elements.

We made several improvements over the original paper, achieving better 3D perception results. The main improvements are the following two points:

Model | Inputs | Op | Overall | Head | Common | Tail | Small | Medium | Large | ScanNet | 3RScan | MP3D | ckpt | log |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BIP3D | RGB | DAG | 16.57 | 23.29 | 13.84 | 12.29 | 2.67 | 17.85 | 12.89 | 19.71 | 26.76 | 8.50 | - | - |
BIP3D | RGB | DAT | 16.67 | 22.41 | 14.19 | 13.18 | 3.32 | 17.25 | 14.89 | 20.80 | 24.18 | 9.91 | - | - |
BIP3D | RGB-D | DAG | 22.53 | 28.89 | 20.51 | 17.83 | 6.95 | 24.21 | 15.46 | 24.77 | 35.29 | 10.34 | - | - |
BIP3D | RGB-D | DAT | 23.24 | 31.51 | 20.20 | 17.62 | 7.31 | 24.09 | 15.82 | 26.35 | 36.29 | 11.44 | - | - |

Model | Inputs | Op | Overall | Easy | Hard | View-dep | View-indep | ScanNet | 3RScan | MP3D | ckpt | log |
---|---|---|---|---|---|---|---|---|---|---|---|---|
BIP3D | RGB | DAG | 44.00 | 44.39 | 39.56 | 46.05 | 42.92 | 48.62 | 42.47 | 36.40 | - | - |
BIP3D | RGB | DAT | 44.43 | 44.74 | 41.02 | 45.17 | 44.04 | 49.70 | 41.81 | 37.28 | - | - |
BIP3D | RGB-D | DAG | 45.79 | 46.22 | 40.91 | 45.93 | 45.71 | 48.94 | 46.61 | 37.36 | - | - |
BIP3D | RGB-D | DAT | 58.47 | 59.02 | 52.23 | 60.20 | 57.56 | 66.63 | 54.79 | 46.72 | - | - |

Model | Inputs | Op | Mixed Data | Overall | Easy | Hard | View-dep | View-indep | ScanNet | 3RScan | MP3D | ckpt | log |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BIP3D | RGB | DAG | No | 45.81 | 46.21 | 41.34 | 47.07 | 45.09 | 50.40 | 47.53 | 32.97 | - | - |
BIP3D | RGB | DAT | No | 47.29 | 47.82 | 41.42 | 48.58 | 46.56 | 52.74 | 47.85 | 34.60 | - | - |
BIP3D | RGB-D | DAG | No | 53.75 | 53.87 | 52.43 | 55.21 | 52.93 | 60.05 | 54.92 | 38.20 | - | - |
BIP3D | RGB-D | DAT | No | 61.36 | 61.88 | 55.58 | 62.43 | 60.76 | 66.96 | 62.75 | 46.92 | - | - |
BIP3D | RGB-D | DAT | Yes | 66.58 | 66.99 | 62.07 | 67.95 | 65.81 | 72.43 | 68.26 | 51.14 | - | - |
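
The effect of each configuration change in the table above is easier to see as a difference in the Overall column. The following is a minimal sketch, assuming only the numbers printed in the table; the names `overall` and `gain` are illustrative and are not part of the BIP3D codebase.

```python
# Overall scores copied from the table above.
# Keys are (inputs, op, mixed_data); the key layout is an illustrative choice.
overall = {
    ("RGB", "DAG", False): 45.81,
    ("RGB", "DAT", False): 47.29,
    ("RGB-D", "DAG", False): 53.75,
    ("RGB-D", "DAT", False): 61.36,
    ("RGB-D", "DAT", True): 66.58,
}


def gain(baseline, variant):
    """Absolute difference in Overall score, rounded to two decimals."""
    return round(overall[variant] - overall[baseline], 2)


# Switching Op from DAG to DAT with RGB input.
print(gain(("RGB", "DAG", False), ("RGB", "DAT", False)))      # 1.48
# Switching Op from DAG to DAT with RGB-D input.
print(gain(("RGB-D", "DAG", False), ("RGB-D", "DAT", False)))  # 7.61
# Adding mixed data on top of RGB-D + DAT.
print(gain(("RGB-D", "DAT", False), ("RGB-D", "DAT", True)))   # 5.22
```

On these numbers, the DAG-to-DAT change matters far more with RGB-D input than with RGB alone, and mixed data adds a further 5.22 points on top of the best single-dataset row.
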

Model | Overall | Easy | Hard | View-dep | View-indep | ckpt | log |
---|---|---|---|---|---|---|---|
EmbodiedScan | 39.67 | 40.52 | 30.24 | 39.05 | 39.94 | - | - |
SAG3D* | 46.92 | 47.72 | 38.03 | 46.31 | 47.18 | - | - |
DenseG* | 59.59 | 60.39 | 50.81 | 60.50 | 59.20 | - | - |
BIP3D | 67.38 | 68.12 | 59.08 | 67.88 | 67.16 | - | - |
BIP3D-Base | 70.53 | 71.22 | 62.91 | 70.69 | 70.47 | - | - |
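
For readers who want to work with these numbers programmatically, the pipe-delimited rows can be parsed directly. The sketch below is an assumption-laden illustration: `TABLE`, `COLUMNS`, and `parse_table` are hypothetical names, the rows are copied from the table above with the empty ckpt and log columns dropped, and nothing here comes from the BIP3D repository.

```python
# Rows copied from the table above (ckpt and log columns omitted).
TABLE = """
EmbodiedScan | 39.67 | 40.52 | 30.24 | 39.05 | 39.94
SAG3D* | 46.92 | 47.72 | 38.03 | 46.31 | 47.18
DenseG* | 59.59 | 60.39 | 50.81 | 60.50 | 59.20
BIP3D | 67.38 | 68.12 | 59.08 | 67.88 | 67.16
BIP3D-Base | 70.53 | 71.22 | 62.91 | 70.69 | 70.47
"""
COLUMNS = ["Overall", "Easy", "Hard", "View-dep", "View-indep"]


def parse_table(text):
    """Map each model name to a dict of {column name: score}."""
    results = {}
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        results[cells[0]] = dict(zip(COLUMNS, map(float, cells[1:])))
    return results


results = parse_table(TABLE)

# Model with the highest Overall score.
best = max(results, key=lambda model: results[model]["Overall"])
print(best)  # BIP3D-Base

# Gap between the Easy and Hard splits for each model.
gaps = {m: round(s["Easy"] - s["Hard"], 2) for m, s in results.items()}
print(gaps["EmbodiedScan"], gaps["BIP3D-Base"])  # 10.28 8.31
```
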

@article{lin2024bip3d,
  title={BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence},
  author={Lin, Xuewu and Lin, Tianwei and Huang, Lichao and Xie, Hongyu and Su, Zhizhong},
  journal={arXiv preprint arXiv:2411.14869},
  year={2024}
}