li sheng

bambisheng

AI & ML interests

None yet

Recent Activity

upvoted a paper 24 days ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 28 days ago • 92

upvoted a paper 4 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 130

authored a paper 5 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120

upvoted a paper 5 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120

published 3 models 5 months ago

updated a model 5 months ago

bambisheng/UltraIF-8B-DPO

Text Generation • 8B • Updated Apr 3 • 6 • 3

updated a collection 5 months ago

UltraIF series

Collection

Open-Sourced model and data for ULTRAIF: Advancing Instruction Following from the Wild. • 6 items • Updated Apr 3 • 3

updated 2 models 5 months ago

bambisheng/UltraIF-8B-UltraComposer

Text Generation • 8B • Updated Apr 3 • 6 • 1

bambisheng/UltraIF-8B-SFT

Text Generation • 8B • Updated Apr 3 • 7 • 2

upvoted a paper 6 months ago

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Paper • 2503.11224 • Published Mar 14 • 29

authored a paper 7 months ago

UltraIF: Advancing Instruction Following from the Wild

Paper • 2502.04153 • Published Feb 6 • 24

liked a dataset 7 months ago

kkk-an/UltraIF-dpo-20k

Preview • Updated Feb 10 • 61 • 7

upvoted a paper 7 months ago

UltraIF: Advancing Instruction Following from the Wild

Paper • 2502.04153 • Published Feb 6 • 24
