聂如 committed on
Commit
2380ed6
·
1 Parent(s): 91126af

Add design file

Files changed (1)
  1. README.md +10 -100
README.md CHANGED
@@ -1,100 +1,10 @@
- # FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views
- [![Website](https://img.shields.io/website-up-down-green-red/http/shields.io.svg)](https://zhanghe3z.github.io/FLARE/)
- [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Hugging%20Face-yellow)](https://huggingface.co/AntResearch/FLARE)
- [![Video](https://img.shields.io/badge/Video-Demo-red)](https://zhanghe3z.github.io/FLARE/videos/teaser_video.mp4)
-
- Official implementation of **FLARE** (CVPR 2025), a feed-forward model for joint camera pose estimation, 3D reconstruction, and novel view synthesis from sparse, uncalibrated views.
-
- ![Teaser](./assets/teaser.jpg)
-
-
- <!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->
-
- - [📖 Overview](#-overview)
- - [🛠️ TODO List](#-todo-list)
- - [🌍 Installation](#-installation)
- - [💿 Checkpoints](#-checkpoints)
- - [🎯 Run a Demo (Point Cloud and Camera Pose Estimation)](#-run-a-demo-point-cloud-and-camera-pose-estimation)
- - [👀 Visualization](#-visualization)
- - [📜 Citation](#-citation)
-
- <!-- TOC end -->
-
- ## 📖 Overview
- We present FLARE, a feed-forward model that simultaneously estimates high-quality camera poses, 3D geometry, and appearance from as few as 2-8 uncalibrated images. Our cascaded learning paradigm proceeds in three stages:
-
- 1. **Camera Pose Estimation**: directly regress camera poses without bundle adjustment.
- 2. **Geometry Reconstruction**: decompose geometry reconstruction into two simpler sub-problems.
- 3. **Appearance Modeling**: enable photorealistic novel view synthesis via 3D Gaussians.
-
- FLARE achieves state-of-the-art performance with inference times under 0.5 seconds.
-
- ## 🛠️ TODO List
- - [x] Release point cloud and camera pose estimation code.
- - [x] Update Gradio demo (app.py).
- - [ ] Release novel view synthesis code. (~2 weeks)
- - [ ] Release evaluation code. (~2 weeks)
- - [ ] Release training code.
- - [ ] Release data processing code.
-
- ## 🌍 Installation
-
- ```bash
- conda create -n flare python=3.8
- conda activate flare
- conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia  # use the CUDA version that matches your system
- pip install -r requirements.txt
- conda uninstall ffmpeg
- conda install -c conda-forge ffmpeg
- ```
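After installation, a quick check that PyTorch sees a GPU can save debugging later (a minimal sketch; assumes the `flare` environment above is active):

```python
import torch

# FLARE's demo scripts expect a CUDA-capable GPU; verify the install here.
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```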
-
-
- ## 💿 Checkpoints
- Download the checkpoint from [Hugging Face](https://huggingface.co/AntResearch/FLARE/blob/main/geometry_pose.pth) and save it as `checkpoints/geometry_pose.pth`.
-
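The demo scripts look for the weights under a relative `checkpoints/` directory. A sketch of preparing that layout from the shell (the direct-download URL is an assumption based on Hugging Face's `resolve/` convention, not taken from the repo):

```shell
# Create the directory the demo scripts expect.
mkdir -p checkpoints
# Then fetch the weights into it, e.g.:
# wget https://huggingface.co/AntResearch/FLARE/resolve/main/geometry_pose.pth \
#      -O checkpoints/geometry_pose.pth
```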
- ## 🎯 Run a Demo (Point Cloud and Camera Pose Estimation)
-
-
- ```bash
- sh scripts/run_pose_pointcloud.sh
- ```
-
-
- ```bash
- torchrun --nproc_per_node=1 run_pose_pointcloud.py \
- --test_dataset "1 @ CustomDataset(split='train', ROOT='Your/Data/Path', resolution=(512,384), seed=1, num_views=7, gt_num_image=0, aug_portrait_or_landscape=False, sequential_input=False)" \
- --model "AsymmetricMASt3R(pos_embed='RoPE100', patch_embed_cls='ManyAR_PatchEmbed', img_size=(512, 512), head_type='catmlp+dpt', output_mode='pts3d+desc24', depth_mode=('exp', -inf, inf), conf_mode=('exp', 1, inf), enc_embed_dim=1024, enc_depth=24, enc_num_heads=16, dec_embed_dim=768, dec_depth=12, dec_num_heads=12, two_confs=True, desc_conf_mode=('exp', 0, inf))" \
- --pretrained "Your/Checkpoint/Path" \
- --test_criterion "MeshOutput(sam=False)" --output_dir "log/" --amp 1 --seed 1 --num_workers 0
- ```
-
- **To run the demo using ground-truth camera poses:**
- Enable the `wpose=True` flag in both `CustomDataset` and `AsymmetricMASt3R`. An example script demonstrating this setup is provided in `scripts/run_pose_pointcloud_wpose.sh`:
-
- ```bash
- sh scripts/run_pose_pointcloud_wpose.sh
- ```
-
- ## 👀 Visualization
-
- ```bash
- sh ./visualizer/vis.sh
- ```
-
-
- ```bash
- CUDA_VISIBLE_DEVICES=0 python visualizer/run_vis.py --result_npz data/mesh/IMG_1511.HEIC.JPG.JPG/pred.npz --results_folder data/mesh/IMG_1511.HEIC.JPG.JPG/
- ```
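The visualizer reads a `pred.npz` archive written by the demo run. As a sketch of inspecting such an archive with NumPy (the array names below are illustrative placeholders, not FLARE's actual output schema):

```python
import numpy as np

# Write a tiny stand-in archive; a real pred.npz from log/ is loaded
# and listed the same way.
np.savez("pred_demo.npz", pts3d=np.zeros((100, 3)), conf=np.ones(100))

with np.load("pred_demo.npz") as data:
    for key in data.files:
        print(key, data[key].shape, data[key].dtype)
```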
-
-
- ## 📜 Citation
- ```bibtex
- @misc{zhang2025flarefeedforwardgeometryappearance,
-       title={FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views},
-       author={Shangzhan Zhang and Jianyuan Wang and Yinghao Xu and Nan Xue and Christian Rupprecht and Xiaowei Zhou and Yujun Shen and Gordon Wetzstein},
-       year={2025},
-       eprint={2502.12138},
-       archivePrefix={arXiv},
-       primaryClass={cs.CV},
-       url={https://arxiv.org/abs/2502.12138},
- }
 
+ title: FLARE
+ emoji: 🔥
+ colorFrom: indigo
+ colorTo: red
+ sdk: gradio
+ sdk_version: 4.38.1
+ app_file: app.py
+ pinned: false
+ models:
+ - zhang3z/FLARE