li sheng

bambisheng

AI & ML interests

None yet

Recent Activity

upvoted a paper 17 days ago

SSRL: Self-Search Reinforcement Learning

upvoted a paper 3 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

authored a paper 4 months ago

TTRL: Test-Time Reinforcement Learning

Organizations

Collections 1

Papers 2

arxiv:2504.16084

arxiv:2502.04153

Models 3

bambisheng/UltraIF-8B-DPO

Text Generation • 8B • Updated Apr 3 • 5 • 3

bambisheng/UltraIF-8B-UltraComposer

Text Generation • 8B • Updated Apr 3 • 5 • 1

bambisheng/UltraIF-8B-SFT

Text Generation • 8B • Updated Apr 3 • 7 • 2

Datasets 0

None public yet