---
license: mit
---

<a name="readme-top"></a> |
|
<p align="center"> |
|
<img src="figs/favicon.svg" alt="Logo" width="150"> |
|
<h1 align="center">Evaluating Text Creativity across Diverse Domains:<br/>A Dataset and a Large Language Model Evaluator</h1>
|
</p> |
|
|
|
|
|
<div align="center"> |
|
<a href="https://creval-creative-evaluation.github.io/"><img src="https://img.shields.io/badge/Project%20Page-666?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="homepage"></a> |
|
<a href="https://arxiv.org/pdf/2505.19236"><img src="https://img.shields.io/badge/arXiv%20paper-666?logo=arxiv&logoColor=FFE165&style=for-the-badge" alt="arXiv"></a> |
|
<br/> |
|
<a href="https://huggingface.co/datasets/Aman/CreataSet"><img src="https://img.shields.io/badge/CreataSet-dataset-blue?logo=databricks&logoColor=white&style=for-the-badge" alt="CreataSet dataset"></a>
<a href="https://huggingface.co/Aman/CrEval-7b"><img src="https://img.shields.io/badge/model-7b-purple?logo=huggingface&logoColor=yellow&style=for-the-badge" alt="CrEval-7b model"></a>
<a href="https://huggingface.co/Aman/CrEval-14b"><img src="https://img.shields.io/badge/model-14b-purple?logo=huggingface&logoColor=yellow&style=for-the-badge" alt="CrEval-14b model"></a>
<a href="https://github.com/Aman-4-Real/CrEval"><img src="https://img.shields.io/badge/github-code-black?logo=github&logoColor=white&style=for-the-badge" alt="GitHub code"></a>
|
<br/> |
|
<hr> |
|
</div> |
|
|
|
|
|
|
|
## News
|
|
|
<div class="scrollable"> |
|
<ul> |
|
<li><strong>[2025, Sep 01]</strong>: We release the dataset <a href="https://huggingface.co/datasets/Aman/CreataSet">CreataSet</a> and our creativity evaluation models <a href="https://huggingface.co/Aman/CrEval-7b">CrEval-7b</a> &amp; <a href="https://huggingface.co/Aman/CrEval-14b">CrEval-14b</a>. Feel free to use them!</li>
<li><strong>[2025, May 25]</strong>: Our <a href="https://arxiv.org/pdf/2505.19236">arXiv paper</a> is available! Check it out for more details.</li>
|
</ul> |
|
</div> |
|
<span id="table-of-contents"></span>
|
|
|
|
|
## Brief Intro
|
|
|
We introduce **CrEval**, the first LLM-based evaluator for pairwise creativity evaluation, which outperforms GPT-4o by 18.7% in human agreement, and **CreataSet**, a large-scale dataset of over **1M** creative instruction-response pairs across **87** domains. CrEval is built on a pairwise comparison protocol and is designed to advance the automated evaluation of text creativity. CreataSet supports the meta-evaluation of pairwise comparison models for text creativity, and it can also be used to train creative generation models. For more details, please refer to our [paper](https://arxiv.org/abs/2505.19236).
|
|
|
|
|
|
|
## Quickstart
|
|
|
You can use our CrEval model via the inference methods provided by [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory). |
|
|
|
Please refer to our [GitHub repo](https://github.com/Aman-4-Real/CrEval) for more details. |
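
As a rough illustration of what a pairwise comparison protocol looks like in code, here is a minimal sketch. It is not the CrEval implementation: the prompt template, the function names (`build_prompt`, `compare`), and the `toy_judge` stand-in are all hypothetical, and the real prompt format and inference entry points are the ones documented in the GitHub repo. The sketch queries a judge in both response orders and only declares a winner when the two verdicts agree, which is one common way to reduce position bias.

```python
from typing import Callable

# Hypothetical template for illustration only; the actual CrEval prompt
# format is defined in the GitHub repo, not here.
PROMPT_TEMPLATE = (
    "Instruction: {instruction}\n"
    "Response A: {a}\n"
    "Response B: {b}\n"
    "Which response is more creative? Answer with A or B."
)


def build_prompt(instruction: str, a: str, b: str) -> str:
    """Fill the (assumed) pairwise template with one instruction and two responses."""
    return PROMPT_TEMPLATE.format(instruction=instruction, a=a, b=b)


def compare(judge: Callable[[str], str], instruction: str,
            first: str, second: str) -> str:
    """Return 'first', 'second', or 'tie'.

    The judge is any callable that maps a prompt to 'A' or 'B'. It is asked
    twice, with the responses in swapped positions, and a winner is reported
    only if both verdicts point at the same response.
    """
    forward = judge(build_prompt(instruction, first, second)).strip().upper()
    backward = judge(build_prompt(instruction, second, first)).strip().upper()
    if forward == "A" and backward == "B":
        return "first"
    if forward == "B" and backward == "A":
        return "second"
    return "tie"


def toy_judge(prompt: str) -> str:
    """Stand-in judge that simply prefers the longer response (NOT a real model)."""
    lines = prompt.splitlines()
    a = lines[1][len("Response A: "):]
    b = lines[2][len("Response B: "):]
    return "A" if len(a) >= len(b) else "B"


if __name__ == "__main__":
    verdict = compare(
        toy_judge,
        "Write a metaphor for time.",
        "Time is a river.",
        "Time is a thief that leaves the door open.",
    )
    print(verdict)
```

To try this with an actual model, `toy_judge` would be replaced by a function that sends the prompt to a served CrEval checkpoint and returns its verdict; how to serve the model is covered by the LLaMA-Factory and GitHub instructions linked above.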
|
|
|
|
|
<hr> |
|
|
|
> *We respect and uphold the usage terms of the original data providers. If you believe that any part of this dataset affects your legal rights or raises other concerns, please reach out to us. We will carefully review your request and respond without delay.* |
|
|
|
|
|
|
|
<h2> Please cite our paper if you find our work useful. </h2> |
|
|
|
```
@article{cao2025evaluating,
  title={Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator},
  author={Cao, Qian and Wang, Xiting and Yuan, Yuzhuo and Liu, Yahui and Luo, Fang and Song, Ruihua},
  journal={arXiv preprint arXiv:2505.19236},
  year={2025}
}
```
|
For any questions, please feel free to reach me at caoqian4real@ruc.edu.cn. |