rubricreward

non-profit

https://rubricreward.github.io

AI & ML interests

R3 Model is all you need

Recent Activity

davidanugraha updated a dataset about 12 hours ago

rubricreward/PolyGuardMix-tgt_prompt_tgt_thinking-filtered_correct

davidanugraha published a dataset about 12 hours ago

rubricreward/PolyGuardMix-tgt_prompt_tgt_thinking-filtered_correct

davidanugraha updated a dataset about 12 hours ago

rubricreward/PolyGuardMix-tgt_prompt_en_thinking-filtered_correct

Datasets 152

rubricreward/PolyGuardMix-tgt_prompt_tgt_thinking-filtered_correct

Viewer • Updated about 12 hours ago • 2.57M • 11

rubricreward/PolyGuardMix-tgt_prompt_en_thinking-filtered_correct

Viewer • Updated about 12 hours ago • 2.62M • 12

rubricreward/PolyGuardMix-en_prompt_en_thinking-filtered_correct

Viewer • Updated about 12 hours ago • 2.63M • 17

rubricreward/PolyGuardMix-tgt_prompt_tgt_thinking

Viewer • Updated about 12 hours ago • 2.88M • 8

rubricreward/PolyGuardMix-tgt_prompt_en_thinking

Viewer • Updated about 13 hours ago • 2.92M • 14

rubricreward/PolyGuardMix-en_prompt_en_thinking

Viewer • Updated about 13 hours ago • 2.92M • 9

rubricreward/HelpSteer3-tgt_prompt_tgt_thinking-filtered_correct

Viewer • Updated about 13 hours ago • 21.1k • 54

rubricreward/HelpSteer3-tgt_prompt_tgt_thinking

Viewer • Updated about 13 hours ago • 38.5k • 40

rubricreward/HelpSteer3-tgt_prompt_en_thinking-filtered_correct

Viewer • Updated 1 day ago • 21.1k • 50

rubricreward/HelpSteer3-en_prompt_en_thinking-filtered_correct

Viewer • Updated 1 day ago • 21.5k • 47

Collections 4

R3: Robust Rubric-Agnostic Reward Models

rubricreward/R3-Qwen3-14B-LoRA-4k

rubricreward/R3-Qwen3-14B-4k

rubricreward/R3-Qwen3-14B-14k

rubricreward/R3-Qwen3-8B-LoRA-4k

Models 66

rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-14B-LoRA-4k

rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-8B-14k

rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-4B-14k

rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-4k

rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-14k

rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-14k

rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-4k

rubricreward/R3-Phi-4-reasoning-plus-LoRA-14k

rubricreward/R3-Qwen3-14B-LoRA-14k

rubricreward/R3-Qwen3-8B-LoRA-14k

Team members 7
