Cornell-AGI
university
AI & ML interests
Reinforcement Learning from Human Feedback
Team members
1
Cornell-AGI's models (20)
Sort: Recently updated
Cornell-AGI/apo_math_qwen2.5_1.5b • Text Generation • 2B • Updated May 5 • 5
Cornell-AGI/ppo_math_qwen2.5_1.5b • Text Generation • 2B • Updated May 5 • 6
Cornell-AGI/rebel_math_qwen2.5_1.5b • Text Generation • 2B • Updated May 5 • 7
Cornell-AGI/grpo_math_qwen2.5_3b • Text Generation • 3B • Updated May 5 • 6
Cornell-AGI/grpo_math_qwen2.5_1.5b • Text Generation • 2B • Updated May 5 • 7
Cornell-AGI/ppo_math_qwen2.5_3b • Text Generation • 3B • Updated May 5 • 10
Cornell-AGI/rebel_math_qwen2.5_3b • Text Generation • 3B • Updated May 5 • 6
Cornell-AGI/apo_math_qwen2.5_3b • Text Generation • 3B • Updated May 5 • 6
Cornell-AGI/grpo_math_qwen2.5_7b • Text Generation • 8B • Updated May 5 • 6
Cornell-AGI/ppo_math_qwen2.5_7b • Text Generation • 8B • Updated May 5 • 7
Cornell-AGI/rebel_math_qwen2.5_7b • Text Generation • 8B • Updated May 4 • 6
Cornell-AGI/apo_math_qwen2.5_7b • Text Generation • 8B • Updated May 4 • 7 • 1
Cornell-AGI/REFUEL-Llama-3-Armo-iter_2 • 8B • Updated Oct 8, 2024 • 8
Cornell-AGI/REFUEL-Llama-3-Armo-iter_1 • 8B • Updated Oct 8, 2024 • 6
Cornell-AGI/REBEL-Llama-3-Armo-iter_3 • 8B • Updated Sep 2, 2024 • 6 • 2
Cornell-AGI/REBEL-Llama-3-Armo-iter_2 • 8B • Updated Sep 2, 2024 • 5 • 1
Cornell-AGI/REBEL-Llama-3-Armo-iter_1 • 8B • Updated Sep 2, 2024 • 5 • 1
Cornell-AGI/REBEL-Llama-3-epoch_2 • Text Generation • Updated Sep 1, 2024 • 8 • 3
Cornell-AGI/REBEL-Llama-3 • Text Generation • Updated Sep 1, 2024 • 6 • 1
Cornell-AGI/REBEL-OpenChat-3.5 • Text Generation • Updated Sep 1, 2024 • 8 • 1