DUAL-GPO/phi-2-gpo-v2-i1
PEFT
TensorBoard
Safetensors
HuggingFaceH4/ultrafeedback_binarized
phi
alignment-handbook
Generated from Trainer
trl
dpo
custom_code
License: apache-2.0
main
Commit History
1fba8a5  End of training (verified; lole25 committed on May 11, 2024)
a7a8a9b  Model save (verified; lole25 committed on May 11, 2024)
0366305  Training in progress, step 900 (verified; lole25 committed on May 11, 2024)
e0d2230  Training in progress, step 800 (verified; lole25 committed on May 11, 2024)
18a4b7a  Training in progress, step 700 (verified; lole25 committed on May 11, 2024)
7c7b038  Training in progress, step 600 (verified; lole25 committed on May 11, 2024)
0033ca4  Training in progress, step 500 (verified; lole25 committed on May 11, 2024)
76aebe2  Training in progress, step 400 (verified; lole25 committed on May 11, 2024)
d87a053  Training in progress, step 200 (verified; lole25 committed on May 11, 2024)
acc908c  Training in progress, step 100 (verified; lole25 committed on May 11, 2024)
69ebd26  initial commit (verified; lole25 committed on May 11, 2024)