zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step312 4B • Updated 8 days ago • 14
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step288 4B • Updated 8 days ago • 13
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step256 4B • Updated 8 days ago • 70
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step224 4B • Updated 8 days ago • 14
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step192 4B • Updated 8 days ago • 13
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step160 4B • Updated 8 days ago • 13
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step128 4B • Updated 8 days ago • 13
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step96 4B • Updated 8 days ago • 14
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step64 4B • Updated 8 days ago • 13