gCao/mistral-7b-dpo-arena
PEFT
Safetensors
lmarena-ai/arena-human-preference-55k
fine-tuning
dpo
arena-dataset
lora
rlhf
License: apache-2.0
main
mistral-7b-dpo-arena
Commit History
Add model card
3d854b9
verified
gCao
committed on
Jun 29
Add DPO model trained on Arena dataset
4cebb7d
verified
gCao
committed on
Jun 29
initial commit
cae87e5
verified
gCao
committed on
Jun 29