---
title: Mv Vton Demo
emoji: π
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.26.0
app_file: app.py
pinned: false
---

# MV-VTON
PyTorch implementation of **MV-VTON: Multi-View Virtual Try-On with Diffusion Models**
[arXiv](https://arxiv.org/abs/2404.17364) · [Project Page](https://hywang2002.github.io/MV-VTON/) · [License: CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)
## News
- 🔥 The first multi-view virtual try-on dataset, MVG, is now available.
- 🔥 Checkpoints for both the frontal-view and multi-view virtual try-on tasks are released.
## Overview

> **Abstract:**
> The goal of image-based virtual try-on is to generate an image of the target person naturally wearing the given
> clothing. However, most existing methods solely focus on the frontal try-on using the frontal clothing. When the
> views of the clothing and person are significantly inconsistent, particularly when the person's view is non-frontal,
> the results are unsatisfactory. To address this challenge, we introduce Multi-View Virtual Try-ON (MV-VTON), which
> aims to reconstruct the dressing results of a person from multiple views using the given clothes. On the one hand,
> given that single-view clothes provide insufficient information for MV-VTON, we instead employ two images, i.e., the
> frontal and back views of the clothing, to encompass the complete view as much as possible. On the other hand,
> diffusion models that have demonstrated superior abilities are adopted to perform our MV-VTON. In particular, we
> propose a view-adaptive selection method where hard-selection and soft-selection are applied to the global and local
> clothing feature extraction, respectively. This ensures that the clothing features are roughly fit to the person's
> view. Subsequently, we suggest a joint attention block to align and fuse clothing features with person features.
> Additionally, we collect an MV-VTON dataset, i.e., Multi-View Garment (MVG), in which each person has multiple photos
> with diverse views and poses. Experiments show that the proposed method not only achieves state-of-the-art results on
> the MV-VTON task using our MVG dataset, but also has superiority on the frontal-view virtual try-on task using the
> VITON-HD and DressCode datasets.
## Getting Started
### Installation
1. Clone the repository
```shell
git clone https://github.com/hywang2002/MV-VTON.git
cd MV-VTON
```
2. Install Python dependencies
```shell
conda env create -f environment.yaml
conda activate mv-vton
```
3. Download the pretrained [vgg](https://drive.google.com/file/d/1rvow8jStPt8t2prDcSRlnf8yzXhrYeGo/view?usp=sharing)
   checkpoint and put it in `models/vgg/` for Multi-View VTON and in `Frontal-View VTON/models/vgg/` for Frontal-View VTON.
4. Download the pretrained models `mvg.ckpt` via [Baidu Cloud](https://pan.baidu.com/s/17SC8fHE5w2g7gEtzJgRRew?pwd=cshy) or [Google Drive](https://drive.google.com/file/d/1J91PoT8A9yqHWNxkgRe6ZCnDEhN-H9O6/view?usp=sharing),
   and `vitonhd.ckpt` via [Baidu Cloud](https://pan.baidu.com/s/1R2yGgm35UwTpnXPEU6-tlA?pwd=cshy) or [Google Drive](https://drive.google.com/file/d/13A0uzUY6PuvitLOqzyHzWASOh0dNXdem/view?usp=sharing).
   Put `mvg.ckpt` in `checkpoint/` and `vitonhd.ckpt` in `Frontal-View VTON/checkpoint/` (see the layout sketch after this list).
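The resulting checkpoint layout can be prepared with something like the following (a minimal sketch based on the paths in steps 3 and 4; the local download paths and the VGG filename are placeholders for whatever your downloads are actually named):
```shell
# Placeholders: adjust to the actual filenames/paths of your downloads.
VGG_CKPT=~/Downloads/vgg_checkpoint.pth
mkdir -p models/vgg checkpoint "Frontal-View VTON/models/vgg" "Frontal-View VTON/checkpoint"
cp "$VGG_CKPT" models/vgg/                                     # VGG weights for Multi-View VTON
cp "$VGG_CKPT" "Frontal-View VTON/models/vgg/"                 # VGG weights for Frontal-View VTON
mv ~/Downloads/mvg.ckpt checkpoint/                            # multi-view try-on model
mv ~/Downloads/vitonhd.ckpt "Frontal-View VTON/checkpoint/"    # frontal-view try-on model
```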
### Datasets
#### MVG
1. Fill out the `Dataset Request Form` via [Baidu Cloud](https://pan.baidu.com/s/12HAq0V4FfgpU_q8AeyZzwA?pwd=cshy) or [Google Drive](https://drive.google.com/file/d/1zWt6HYBz7Vzaxu8rp1bwkhRoBkxbwQjw/view?usp=sharing),
   and send it to `cshy2mvvton@outlook.com` to obtain the MVG dataset (non-institutional email addresses such as gmail.com are not accepted; please use your institutional email address).
After that, the folder structure should look like this (the warp_feat_unpair* directory is only included in the test split):
```
├── MVG
│   ├── unpaired.txt
│   ├── [train | test]
│   │   ├── image-wo-bg
│   │   ├── cloth
│   │   ├── cloth-mask
│   │   ├── warp_feat
│   │   ├── warp_feat_unpair
│   │   ├── ...
```
#### VITON-HD
1. Download [VITON-HD](https://github.com/shadow2496/VITON-HD) dataset
2. Download the pre-warped cloth images/masks via [Baidu Cloud](https://pan.baidu.com/s/1uQM0IOltOmbeqwdOKX5kCw?pwd=cshy) or [Google Drive](https://drive.google.com/file/d/18DTWfhxUnfg41nnwwpCKN--akC4eT9DM/view?usp=sharing)
   and put them under the VITON-HD dataset directory.
After that, the folder structure should look like this (the unpaired-cloth* directories are only included in the test split):
```
├── VITON-HD
│   ├── test_pairs.txt
│   ├── train_pairs.txt
│   ├── [train | test]
│   │   ├── image
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── cloth
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── cloth-mask
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── cloth-warp
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── cloth-warp-mask
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── unpaired-cloth-warp
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
│   │   ├── unpaired-cloth-warp-mask
│   │   │   ├── [000006_00.jpg | 000008_00.jpg | ...]
```
### Inference
#### MVG
To test under the paired setting (which uses `cp_dataset_mv_paired.py`), either modify `configs/viton512.yaml` and `main.py`,
or simply rename `cp_dataset_mv_paired.py` to `cp_dataset.py` (recommended). Then run:
```shell
sh test.sh
```
To test under the unpaired setting, rename `cp_dataset_mv_unpaired.py` to `cp_dataset.py` instead and run the same command; a combined sketch follows below.
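For reference, here is a minimal sketch of both MVG test modes, assuming the renaming approach above (`cp` is used rather than an in-place rename so the original loader files are kept):
```shell
# paired setting
cp cp_dataset_mv_paired.py cp_dataset.py
sh test.sh

# unpaired setting
cp cp_dataset_mv_unpaired.py cp_dataset.py
sh test.sh
```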
#### VITON-HD
To test under the paired setting, run `cd Frontal-View\ VTON/`, then:
```shell
sh test.sh
```
To test under the unpaired setting, run `cd Frontal-View\ VTON/`, add `--unpaired` to the test command in `test.sh`, and then run:
```shell
sh test.sh
```
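Put together, a rough sketch of the frontal-view test flow (the exact test command lives inside `test.sh`; `--unpaired` is appended to that command for the unpaired setting):
```shell
cd Frontal-View\ VTON/
# paired setting
sh test.sh
# unpaired setting: first append --unpaired to the test command inside test.sh, then
sh test.sh
```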
#### Metrics
We compute `LPIPS`, `SSIM`, `FID`, and `KID` with the same evaluation tools as [LaDI-VTON](https://github.com/miccunifi/ladi-vton).
### Training
#### MVG
We use Paint-by-Example for initialization. Please download its pretrained model
from [Google Drive](https://drive.google.com/file/d/15QzaTWsvZonJcXsNv-ilMRCYaQLhzR_i/view) and save it to the
`checkpoints` directory. Rename `cp_dataset_mv_paired.py` to `cp_dataset.py`, then run:
```shell
sh train.sh
```
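As a quick reference, the MVG training prep can be scripted roughly as follows (a sketch; the Paint-by-Example checkpoint path below is a placeholder for whatever the downloaded file is named):
```shell
# place the Paint-by-Example initialization weights (placeholder path/filename)
mkdir -p checkpoints
mv ~/Downloads/paint_by_example.ckpt checkpoints/
# use the paired dataset loader, then start training
cp cp_dataset_mv_paired.py cp_dataset.py
sh train.sh
```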
#### VITON-HD
Run `cd Frontal-View\ VTON/`, then:
```shell
sh train.sh
```
## Acknowledgements
Our code borrows heavily from [Paint-by-Example](https://github.com/Fantasy-Studio/Paint-by-Example)
and [DCI-VTON](https://github.com/bcmi/DCI-VTON-Virtual-Try-On). We also thank the prior works
[PF-AFN](https://github.com/geyuying/PF-AFN), [GP-VTON](https://github.com/xiezhy6/GP-VTON),
[LaDI-VTON](https://github.com/miccunifi/ladi-vton),
and [StableVITON](https://github.com/rlawjdghek/StableVITON).
## LICENSE
MV-VTON: Multi-View Virtual Try-On with Diffusion Models © 2024 by Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, Wangmeng Zuo is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).
## Citation
```
@article{wang2024mv,
  title={MV-VTON: Multi-View Virtual Try-On with Diffusion Models},
  author={Wang, Haoyu and Zhang, Zhilu and Di, Donglin and Zhang, Shiliang and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2404.17364},
  year={2024}
}
```