Harmon-1.5B-RecA-plus (Paper Coming Soon)
A self-supervised training framework that aligns understanding and generation with modest compute, yielding large zero-shot gains in generation and editing capability.
This repository hosts the model weights for Harmon-1.5B-RecA-plus. For installation, usage instructions, and further documentation, please visit Harmon's original GitHub repository.
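As a minimal sketch of fetching the weights hosted here, the snippet below assumes the `huggingface_hub` Python package is installed. The repository ID is taken from this model card. The `download_weights` helper and the `local_dir` default are hypothetical names chosen for illustration and are not part of Harmon's codebase. Loading and running the model still requires the code from Harmon's GitHub repository.

```python
# Sketch only: downloads the files in this model repository to a local folder.
# Assumes `pip install huggingface_hub`; nothing here loads or runs the model.

REPO_ID = "sanaka87/Harmon-1.5B-RecA-plus"  # repository ID from this model card


def download_weights(local_dir: str = "checkpoints/Harmon-1.5B-RecA-plus") -> str:
    """Download a snapshot of the repository and return the local path.

    `local_dir` is an illustrative default, not a path Harmon's code expects.
    """
    # Imported lazily so this file can be imported without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


# Example call (requires network access):
#     path = download_weights()
#     print("Weights saved under", path)
```

The returned path can then be passed to whatever checkpoint argument Harmon's own scripts expect. Check their documentation for the exact option name.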
Method
Coming soon! Stay tuned~
Benchmarks
1. Visual Understanding
Remains unchanged.
2. Text-to-Image Generation
We evaluate at 1024x1024 resolution.
Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
---|---|---|---|
Harmon-1.5B | 0.73 | 80.93 | 0.50 |
Harmon-1.5B-RecA-plus | 0.90 | 88.15 | 0.52 |
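To make the size of these gains concrete, the following sketch computes the absolute and relative improvement of Harmon-1.5B-RecA-plus over Harmon-1.5B. The scores are copied from the table above. The `SCORES` and `gains` names are illustrative, not part of any released code.

```python
# Scores copied from the table: (Harmon-1.5B, Harmon-1.5B-RecA-plus).
SCORES = {
    "GenEval": (0.73, 0.90),
    "DPGBench": (80.93, 88.15),
    "WISE": (0.50, 0.52),
}


def gains(scores):
    """Return {benchmark: (absolute gain, relative gain in percent)}."""
    result = {}
    for name, (base, reca) in scores.items():
        delta = reca - base
        result[name] = (delta, 100.0 * delta / base)
    return result


GAINS = gains(SCORES)

for name, (delta, percent) in GAINS.items():
    print(f"{name}: +{delta:.2f} ({percent:.1f}% relative)")
```

This works out to roughly +0.17 on GenEval (about 23% relative), +7.22 on DPGBench (about 9%), and +0.02 on WISE (4%).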
License
Harmon-1.5B-RecA-plus is licensed under the Apache 2.0 license.
Citation
If you find our work inspiring or use our codebase in your research, please consider giving us a star and a citation~
@misc{xie2025reconstructionalignmentimprovesunified,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Ji Xie and Trevor Darrell and Luke Zettlemoyer and XuDong Wang},
  year={2025},
  eprint={2509.07295},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.07295},
}
Base model: wusize/Harmon-1_5B