61

Joakim Lee

Reinforcement4All

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers about 9 hours ago

Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following

Paper • 2508.02150 • Published 1 day ago • 20

Qwen-Image Technical Report

Paper • 2508.02324 • Published 1 day ago • 79

upvoted a paper about 18 hours ago

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Paper • 2508.00414 • Published 5 days ago • 57

upvoted 5 papers 5 days ago

MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE

Paper • 2507.21802 • Published 7 days ago • 10

Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding

Paper • 2507.19427 • Published 11 days ago • 16

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Paper • 2507.22448 • Published 7 days ago • 59

Flow Equivariant Recurrent Neural Networks

Paper • 2507.14793 • Published 17 days ago • 2

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published 5 days ago • 95

upvoted 3 papers 6 days ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published 8 days ago • 72

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published 11 days ago • 127

HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

Paper • 2507.21809 • Published 7 days ago • 111

upvoted 2 papers 8 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 13 days ago • 267

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Paper • 2507.19457 • Published 11 days ago • 20

upvoted 5 papers 13 days ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published 14 days ago • 57

Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning

Paper • 2507.16784 • Published 14 days ago • 113

The Serial Scaling Hypothesis

Paper • 2507.12549 • Published 20 days ago • 9

"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models

Paper • 2507.13428 • Published 19 days ago • 15

Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos

Paper • 2507.15597 • Published 15 days ago • 33

upvoted a paper 15 days ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published 17 days ago • 122

upvoted a paper 19 days ago

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Paper • 2507.13332 • Published 19 days ago • 48