Harmon-1.5B-RecA-plus (Paper Coming Soon)

A self-supervised training framework that aligns understanding and generation with modest compute, yielding large zero-shot gains in generation and editing capability.

This repository hosts the model weights for Harmon-1.5B-RecA-plus. For installation, usage instructions, and further documentation, please visit Harmon's original GitHub repository.

🧠 Method

Coming soon! Stay tuned~

📊 Benchmarks

1. Visual Understanding

Visual understanding performance remains unchanged.

2. Text-to-Image Generation

We evaluate the model at 1024x1024 resolution.

Model	GenEval ↑	DPGBench ↑	WISE ↑
Harmon-1.5B	0.73	80.93	0.50
Harmon-1.5B-RecA-plus	0.90	88.15	0.52
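
To make the size of the improvement easier to read, the short sketch below computes the absolute and relative gain on each benchmark from the scores in the table above. The dictionary names and the `gains` helper are hypothetical and exist only for this illustration; they are not part of the Harmon or RecA codebase.

```python
# Scores copied from the table above (higher is better on all three).
BASELINE = {"GenEval": 0.73, "DPGBench": 80.93, "WISE": 0.50}
RECA_PLUS = {"GenEval": 0.90, "DPGBench": 88.15, "WISE": 0.52}


def gains(baseline, tuned):
    """Return {benchmark: (absolute gain, relative gain in percent)}.

    Hypothetical helper for illustration only.
    """
    out = {}
    for name, base in baseline.items():
        delta = tuned[name] - base
        # Round to the precision reported in the table.
        out[name] = (round(delta, 2), round(100 * delta / base, 1))
    return out


if __name__ == "__main__":
    for name, (abs_gain, rel_gain) in gains(BASELINE, RECA_PLUS).items():
        print(f"{name}: +{abs_gain} absolute, +{rel_gain}% relative")
```

Note that GenEval and WISE are reported on a 0 to 1 scale while DPGBench is on a 0 to 100 scale, so the relative gain is the more comparable figure across columns.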

License

Harmon-1.5B-RecA-plus is licensed under the Apache 2.0 license.

✍️ Citation

If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation~

@misc{xie2025reconstructionalignmentimprovesunified,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Ji Xie and Trevor Darrell and Luke Zettlemoyer and XuDong Wang},
  year={2025},
  eprint={2509.07295},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.07295},
}
