Commit 2380ed6 (parent 91126af): Add design file

README.md CHANGED
@@ -1,100 +1,10 @@

Removed (old README.md content):
<!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->

- [📖 Overview](#-overview)
- [🛠️ TODO List](#-todo-list)
- [🚀 Installation](#-installation)
- [💿 Checkpoints](#-checkpoints)
- [🎯 Run a Demo (Point Cloud and Camera Pose Estimation)](#-run-a-demo-point-cloud-and-camera-pose-estimation)
- [🌟 Visualization](#-visualization)
- [📜 Citation](#-citation)

<!-- TOC end -->

## 📖 Overview

We present FLARE, a feed-forward model that simultaneously estimates high-quality camera poses, 3D geometry, and appearance from as few as 2-8 uncalibrated images. Our cascaded learning paradigm:

1. **Camera Pose Estimation**: directly regresses camera poses, without bundle adjustment.
2. **Geometry Reconstruction**: decomposes geometry reconstruction into two simpler sub-problems.
3. **Appearance Modeling**: enables photorealistic novel view synthesis via 3D Gaussians.

FLARE achieves state-of-the-art performance with an inference time under 0.5 seconds!
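
As a mental model, the cascade amounts to three stages feeding one another. The sketch below is purely illustrative: every function name in it is hypothetical and none of them exist in this codebase (the real entry point is run_pose_pointcloud.py).

```python
# Illustrative-only sketch of the cascaded design described above.
# All names are hypothetical; they do not appear in the FLARE codebase.
from typing import Any, List

def estimate_poses(images: List[Any]) -> List[Any]:
    """Stage 1: regress camera poses directly, with no bundle adjustment."""
    ...

def reconstruct_geometry(images: List[Any], poses: List[Any]) -> Any:
    """Stage 2: solve geometry as two simpler sub-problems instead of one."""
    ...

def fit_gaussians(images: List[Any], poses: List[Any], geometry: Any) -> Any:
    """Stage 3: model appearance with 3D Gaussians for novel view synthesis."""
    ...

def flare_inference(images: List[Any]) -> Any:
    """End-to-end feed-forward pass over 2-8 uncalibrated input images."""
    poses = estimate_poses(images)
    geometry = reconstruct_geometry(images, poses)
    return fit_gaussians(images, poses, geometry)
```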

## 🛠️ TODO List

- [x] Release point cloud and camera pose estimation code.
- [x] Update the Gradio demo (app.py).
- [ ] Release novel view synthesis code. (~2 weeks)
- [ ] Release evaluation code. (~2 weeks)
- [ ] Release training code.
- [ ] Release data processing code.

## 🚀 Installation

```bash
conda create -n flare python=3.8
conda activate flare
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia  # use the correct version of CUDA for your system
pip install -r requirements.txt
conda uninstall ffmpeg
conda install -c conda-forge ffmpeg
```
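
A quick sanity check after installation (a minimal sketch, assuming the `flare` environment is active) verifies that PyTorch was built against CUDA and can see a GPU:

```python
# Sanity check: confirm the PyTorch install is CUDA-enabled and sees a GPU.
import torch

print(torch.__version__)          # a "+cu121" suffix indicates a CUDA 12.1 build
print(torch.cuda.is_available())  # should print True on a working GPU setup
```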

## 💿 Checkpoints

Download the checkpoint from [Hugging Face](https://huggingface.co/AntResearch/FLARE/blob/main/geometry_pose.pth) and place it at checkpoints/geometry_pose.pth.
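
Alternatively, the file can be fetched programmatically. This is a sketch using the `huggingface_hub` library rather than the documented manual download; it assumes you run it from the repository root:

```python
# Sketch: download the released checkpoint with huggingface_hub
# (pip install huggingface_hub). Not part of the documented workflow.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="AntResearch/FLARE",
    filename="geometry_pose.pth",
    local_dir="checkpoints",  # lands at checkpoints/geometry_pose.pth
)
print(path)
```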

## 🎯 Run a Demo (Point Cloud and Camera Pose Estimation)

```bash
sh scripts/run_pose_pointcloud.sh
```

The script runs a command of the form:

```bash
torchrun --nproc_per_node=1 run_pose_pointcloud.py \
    --test_dataset "1 @ CustomDataset(split='train', ROOT='Your/Data/Path', resolution=(512,384), seed=1, num_views=7, gt_num_image=0, aug_portrait_or_landscape=False, sequential_input=False)" \
    --model "AsymmetricMASt3R(pos_embed='RoPE100', patch_embed_cls='ManyAR_PatchEmbed', img_size=(512, 512), head_type='catmlp+dpt', output_mode='pts3d+desc24', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), enc_embed_dim=1024, enc_depth=24, enc_num_heads=16, dec_embed_dim=768, dec_depth=12, dec_num_heads=12, two_confs=True, desc_conf_mode=('exp', 0, inf))" \
    --pretrained "Your/Checkpoint/Path" \
    --test_criterion "MeshOutput(sam=False)" --output_dir "log/" --amp 1 --seed 1 --num_workers 0
```

**To run the demo using ground-truth camera poses:**

Enable the `wpose=True` flag in both `CustomDataset` and `AsymmetricMASt3R`. An example script demonstrating this setup is provided in scripts/run_pose_pointcloud_wpose.sh:

```bash
sh scripts/run_pose_pointcloud_wpose.sh
```

## 🌟 Visualization

```bash
sh ./visualizer/vis.sh
```

For example, invoking the viewer directly on one of the demo outputs:

```bash
CUDA_VISIBLE_DEVICES=0 python visualizer/run_vis.py --result_npz data/mesh/IMG_1511.HEIC.JPG.JPG/pred.npz --results_folder data/mesh/IMG_1511.HEIC.JPG.JPG/
```
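
The viewer consumes the `pred.npz` archive written by the demo. To inspect that file directly, a minimal sketch (the array names inside it are not documented here, so the loop simply lists whatever is present):

```python
# Sketch: list the arrays stored in the demo's pred.npz output.
import numpy as np

data = np.load("data/mesh/IMG_1511.HEIC.JPG.JPG/pred.npz")
for name in data.files:  # actual array names depend on the demo's output format
    print(name, data[name].shape, data[name].dtype)
```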

## 📜 Citation

```bibtex
@misc{zhang2025flarefeedforwardgeometryappearance,
  title={FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views},
  author={Shangzhan Zhang and Jianyuan Wang and Yinghao Xu and Nan Xue and Christian Rupprecht and Xiaowei Zhou and Yujun Shen and Gordon Wetzstein},
  year={2025},
  eprint={2502.12138},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.12138},
}
```

Added (new README.md content, the Hugging Face Space configuration):

title: FLARE
emoji: 🔥
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 4.38.1
app_file: app.py
pinned: false
models:
- zhang3z/FLARE