Motoki Wu PRO

tokestermw

https://motoki.co

Recent Activity

upvoted a paper 1 day ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published 5 days ago • 86

upvoted 2 papers 7 days ago

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published 10 days ago • 21

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published 11 days ago • 121

liked a Space 8 days ago

Sheets 🗂

Create and enrich datasets using AI • 532

liked a model 8 days ago

xai-org/grok-2

Updated 10 days ago • 4.26k • 899

liked a Space 13 days ago

Jupyter Agent 2 🏃

Run code and analyze data in a Jupyter notebook • 179

liked a model 13 days ago

stepfun-ai/NextStep-1-Large-Edit

Image-to-Image • 15B • Updated 14 days ago • 695 • 47

upvoted a collection 14 days ago

NVIDIA Nemotron

Collection

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 3 items • Updated 4 days ago • 55

liked 2 models 14 days ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated 7 days ago • 23.2k • 960

nvidia/NVIDIA-Nemotron-Nano-9B-v2

Text Generation • 9B • Updated 4 days ago • 70.3k • 295

upvoted a paper 15 days ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 19 days ago • 91

liked a model 15 days ago

mistralai/Mistral-Small-3.2-24B-Instruct-2506

24B • Updated 12 days ago • 370k • 437

upvoted a paper 18 days ago

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published 19 days ago • 26

liked 2 models 18 days ago

hf-internal-testing/tiny-random-gpt2

0.0B • Updated Apr 8, 2024 • 776k • 8

google/gemma-3-270m-it

Text Generation • 0.3B • Updated 19 days ago • 162k • 370

liked 4 models 19 days ago

voyageai/voyage-context-3

Updated Jul 16 • 11

Snowflake/snowflake-arctic-embed-l-v2.0

Snowflake/snowflake-arctic-embed-l

jxm/gpt-oss-20b-base

Text Generation • 12B • Updated 13 days ago • 10.5k • 213

upvoted a paper 20 days ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published 26 days ago • 168